LABORATORY FOR
COMPUTER SCIENCE


MASSACHUSETTS
INSTITUTE OF
TECHNOLOGY


MIT/LCS/TR-184

FACILITATING 
INTERPROCESS 
COMMUNICATION
IN A HETEROGENEOUS
NETWORK  ENVIRONMENT 

Paul  H.  Levine 

This research was supported by the Advanced
Research  Projects  Agency  of  the  Department 
of  Defense  and  was  monitored  by  the  Office 
of  Naval  Research  under  Contract  No.  N00014-75-C-0661 


DISTRIBUTION  STATEMENT  A 
Approved for public release;


545 TECHNOLOGY SQUARE, CAMBRIDGE, MASSACHUSETTS 021


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS BEFORE COMPLETING FORM

1. REPORT NUMBER
MIT/LCS/TR-184

2. GOVT ACCESSION NO.

3. RECIPIENT'S CATALOG NUMBER

4. TITLE (and Subtitle)
Facilitating Interprocess Communication in a
Heterogeneous Network Environment

5. TYPE OF REPORT & PERIOD COVERED
S.B. & S.M. Thesis
March 25, 1977

6. PERFORMING ORG. REPORT NUMBER
MIT/LCS/TR-184

7. AUTHOR(s)
Paul H. Levine

8. CONTRACT OR GRANT NUMBER(s)
N00014-75-C-0661

9. PERFORMING ORGANIZATION NAME AND ADDRESS
MIT/Laboratory for Computer Science
545 Technology Square
Cambridge, Ma 02139

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS

11. CONTROLLING OFFICE NAME AND ADDRESS
Advanced Research Projects Agency
Department of Defense
1400 Wilson Boulevard
Arlington, Virginia 222

12. REPORT DATE
July 1977

13. NUMBER OF PAGES
107

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office)
Office of Naval Research
Department of the Navy
Information Systems Program
Arlington, Virginia 22217

15. SECURITY CLASS. (of this report)
Unclassified

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE

16. DISTRIBUTION STATEMENT (of this Report)
Approved for public release; distribution unlimited

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report)

18. SUPPLEMENTARY NOTES

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

Passing information among processors with different internal data
formatting schemes has proven to be a major complication to computer
networking efforts. Data format translation is necessary to support
information exchange in a heterogeneous network environment. Three
strategies for performing this translation for communications between a
message sender and receiver are: translation by the receiver, translation
by an intermediate translator, and the use of a standard intermediate
format. The standard format is shown to be the most responsive to a set
of general network design principles.

The implementation of an intermediate format based interprocess
communications scheme requires a mechanism for passing the semantic
description of each string of data bits. Two alternative mechanisms for
passing this information are discussed, and data "tagging" is selected as
the more flexible. Other implementation considerations are examined,
including possible problems in performing translation and the relationship
format translation has to other network message handling functions.

DD FORM 1473 (1 JAN 73)   EDITION OF 1 NOV 65 IS OBSOLETE   S/N 0102-014-6601


MIT/LCS/TR-184


FACILITATING  INTERPROCESS  COMMUNICATION 
IN  A 

HETEROGENEOUS  NETWORK  ENVIRONMENT 


Paul  Howard  Levine 
July  1977 


Publication  of  this  report  was  sponsored  by  the  Computer  Systems  Research 
Division  of  the  Laboratory  for  Computer  Science,  an  M.I.T.  Interdepartmental 
Laboratory  and  was  supported  in  part  by  the  Advanced  Research  Projects 
Agency  (ARPA)  of  the  Department  of  Defense  and  was  monitored  by  the  Office 
of Naval Research under contract No. N00014-75-C-0661.

This research was funded in part by Naval Underwater Systems Center IRAD
funds, project number A75030, W. A. Clearwaters, principal investigator.


MASSACHUSETTS  INSTITUTE  OF  TECHNOLOGY 
LABORATORY  FOR  COMPUTER  SCIENCE 


FACILITATING INTERPROCESS COMMUNICATION
IN  A 

HETEROGENEOUS  NETWORK  ENVIRONMENT  * 


by 

Paul  Howard  Levine 


ABSTRACT 


Passing  information  among  processors  with  different  internal  data 
formatting  schemes  has  proven  to  be  a major  complication  to  computer 
networking  efforts.  Data  format  translation  is  necessary  to  support 
information  exchange  in  a heterogeneous  network  environment.  Three 
strategies  for  performing  this  translation  for  communications  between 
a message sender and receiver are: translation by the receiver,
translation by an intermediate translator, and the use of a standard
intermediate  format.  The  standard  format  is  shown  to  be  the  most 
responsive  to  a set  of  general  network  design  principles. 

The  implementation  of  an  intermediate  format  based  interprocess 
communications  scheme  requires  a mechanism  for  passing  the  semantic 
description  of  each  string  of  data  bits.  Two  alternative  mechanisms 
for  passing  this  information  are  discussed,  and  data  "tagging"  is 
selected  as  the  more  flexible.  Other  implementation  considerations 
are  examined,  including  possible  problems  in  performing  translation 
and the relationship format translation has to other network message
handling functions.


CHAPTER I


Heterogeneous Computer Networks

A recent trend in computer systems research has been towards the
investigation of and experimentation with computer networks. Besides the
extensive work on ARPANET <Frank, Heart, Metcalfe1, Crocker, ARPA> and
other geographically distributed computer networks <Pouzin1, Wood>,
the possible implementations and applications of local computer networks
are also being researched at an ever-increasing number of laboratories
across the country <Fraser1, Farber1, Metcalfe2, Mills, Binder, MRG,
Hitt, Chen, Wulf, Swan>. Passing information among processors with
different internal data formats has proven to be a major complication
to these computer networking efforts <Farber2, Millstein, VanDam2>.

1.1 Networking

The definition of a computer network can be phrased in terms of a
network's form and function. One such definition asserts that

...a computer network is defined to be a set of
autonomous, independent computer systems [the form],
interconnected so as to permit interactive resource
sharing between any pair of systems [the function].
[<Roberts> p. 543.]




Such  a network  is  embodied  as: 

1) a collection of hosts (computers) providing service to
a user (either an end user or another host computer), and

2) a subnetwork providing communication among host
computers, users, or both. [<Kimbleton2> p. 129.]

A typical network is depicted in Figure 1-1. The subnetwork is built from
nodes and the communications links that serve as the data paths between
the nodes. The nodes interface the network hosts to the subnetwork
<Crowther>. As shown, it may be possible for a single node to support the
network demands of more than one host.

Study of techniques for supporting general inter-computer
information exchange is motivated by the proposed uses of computer
networks. The most often cited rationale for computer networking is
probably the facilitation of "resource sharing" <Chen, Farber1,
Mills, Roberts, Thomas2>. However, especially in the case of
geographically local networks, much attention is now being focused on
computer networks as the hardware/firmware base for distributed
systems <Kimbleton1, MRG, Rowe, Swan, Thomas1>.

[Figure 1-1. Typical computer network.]

Resource sharing networks are those computer communications
systems that provide access to remote hardware and software services.
This may include the use of standard peripherals, special hardware
devices, information or software utilities through the network. The
advantages are primarily economic. With the cost of the processing unit
becoming a smaller percentage of the total system cost for an
installation, concern has shifted to the cost of providing system
services. By increasing access to high cost software, large data
bases, or expensive peripherals, the need for redundant facilities can be
minimized.

A distributed computer network has been described as a system that
supports the execution of a user task by using multiple components
throughout the network, each component performing some part of the required
task <VanDam1, Wecker>. The subtasks communicate over the network to
accomplish the complete assignment. The principal distinction between
this and a resource sharing network is that a distributed system
offers the end user an interface to a single coherent system and yet
employs a network of computers to process his request <Elovitz, Enslow>.
Networks supporting distributed systems can transparently offer a user
the performance advantages of load sharing and parallel processing as
well as the reliability feature of hardware modularity and modular
redundancy.


1.2  Interprocess  communication 


The transfer of information between computers in a network can
accurately be described as data exchange between distinct processes
active on different processors. This view is a natural one for network
based distributed systems. One model of such a system consists of
several procedures for each task, running on different processors and
performing the required interprocess communications across the
network. However, viewing network message passing as a case of
interprocess communications is also appropriate for resource sharing
networks.


It is useful to think of resources as being
associated with processes and available only through
communication with these processes. This is a viewpoint
that has been successfully applied to time-sharing systems
and has more recently been suggested to be an
appropriate view for computer networks. Consistent with this
view, the fundamental problem of resource sharing
is...the problem of interprocess communication.... The view
is also held that interprocess communication over a network
is a subcase of general interprocess communication in
a multiprogrammed environment. [<Walden> pp.
221-222.]


Considering the messages passed between network hosts to be
instances of interprocess communication provides insight into the
mechanisms needed to support inter-host network communications.
Specifically, any message passing scheme must support the transfer of the
kinds of messages that are the units of communication between
processes. Communicating processes may need to exchange only boolean
values or entire data files. The ALGOL-like languages allow the
interprocess exchange of the primitive data types (INTEGER, CHARACTER) as
well as more complex structures (STRINGS, ARRAYS). To facilitate
inter-host communications, then, a network message passing strategy must
support the transfer of both simple and composite data types.


The problem of passing information across a network can be broken down
into two stages. First, regardless of the information being passed, a
protocol must be established that assures the bit integrity of exchanged
messages. Schemes for this level of hand-shaking usually employ a three
part structure, including a header, the data bits to be passed and a
trailer (Figure 1-2).


[Figure 1-2. Message layout: header, data field, trailer.]


The header contains a destination field as well as some, possibly
complex, message control information. The data field is usually
transparent to the message passing hardware and protocols. The
trailer contains the error-checking codes and status information.
This level of interprocessor communication has been examined
extensively in the literature <Bhushan, Metcalfe2, White> and is not
addressed in this study.
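
The three part structure can be illustrated with a minimal sketch. The field widths, the use of a CRC-32 as the error-checking code, and the names `build_message` and `parse_message` are assumptions made for illustration only; the report does not prescribe any of them.

```python
import struct
import zlib

# Hypothetical layout: 16-bit destination and 16-bit data length in the header,
# 32-bit CRC in the trailer. The data field is opaque to this level.
HEADER = struct.Struct(">HH")
TRAILER = struct.Struct(">I")


def build_message(destination: int, data: bytes) -> bytes:
    """Wrap opaque data bits in a header and an error-checking trailer."""
    header = HEADER.pack(destination, len(data))
    trailer = TRAILER.pack(zlib.crc32(header + data))
    return header + data + trailer


def parse_message(message: bytes) -> tuple[int, bytes]:
    """Check bit integrity and return (destination, data) without interpreting data."""
    destination, length = HEADER.unpack_from(message, 0)
    data_end = HEADER.size + length
    data = message[HEADER.size:data_end]
    (checksum,) = TRAILER.unpack_from(message, data_end)
    if zlib.crc32(message[:data_end]) != checksum:
        raise ValueError("checksum mismatch: message damaged in transit")
    return destination, data
```

Note that nothing here says what the data bits mean; a receiver with a different internal format gets the bits intact but still cannot interpret them, which is the second stage discussed next.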

The second stage of information transfer over a network is the
interpretation of the bits in the data field. Because the internal
representation of data is different across products of different
computer manufacturers and even computer products from the same
manufacturer, some reformatting of the information is necessary to
support information transfer in networks.


1.3 Heterogeneity

Little has emerged in the way of techniques for allowing
different kinds of processors in a heterogeneous computing environment to
exchange information in a general way. Rather than concentrating on the
semantic content of interprocessor messages, much of the effort has been
directed towards simply getting one host computer in a network to
accept unexamined binary data from another. To this end, several
topologies for computer interconnection have appeared, as well as schemes
for insuring delivery of a binary packet from a sending host to its
intended receiver. Yet to be addressed is the problem of also
transmitting the semantic content or "meaning" of the bits in a general
way. Passing the bits themselves is only the first step towards
interprocessor communications. At this point, a mechanism is needed to
support the passing of information.


Insuring the integrity of a binary string as it is moved
from the memory of one processor to another, has not been
easy. Many complex issues concerning error detection
and recovery, message routing, system response and
component loading have been faced, only to uncover the next
set of problems, that of providing adequate semantics for the
transferred bit strings... Suppose the text file of one system
requires a carriage return and line feed as a line
separator, while another system requires a carriage return.
Who should be responsible for the inclusion or exclusion of
the line feed? Worse yet, what do we do about
incompatible integers, character sets, and floating point
data types? Current solutions are worked out by cooperative
programmers, not processors, and severely limit solutions to
dynamic reconfiguration and load sharing among connected
processors. [<Gordon> p. 4]
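
The line separator case in the quotation is small enough to sketch. The function name `convert_line_separators` is hypothetical, and the sketch deliberately leaves open the question the quotation raises, namely which party (sender, receiver, or an intermediary) should run it.

```python
CR = b"\r"
CRLF = b"\r\n"


def convert_line_separators(text: bytes, sender_sep: bytes, receiver_sep: bytes) -> bytes:
    """Re-express a text file using the receiver's line separator convention."""
    lines = text.split(sender_sep)      # neutral form: a list of lines
    return receiver_sep.join(lines)     # receiver's convention


# A file from a CR LF system, rewritten for a system that expects CR only.
converted = convert_line_separators(b"ONE\r\nTWO\r\nTHREE", CRLF, CR)
```

Even this trivial translation requires that someone know both conventions, which is why the choice of where translation happens matters.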


These data formatting problems have been essentially avoided in some
networks by inter-connecting strictly machines that use similar internal
data representations <Fredericksen, Mills, Thomas1, Swan, Wulf>.
Processors in such a homogeneous computing environment require no data
format translation to exchange information. They are assured by common
hardware and software design that the semantic content of their passed
data will be correctly understood by their intended receiver if the
bit content is delivered correctly.


This approach to distributed computing, although attractive, is not
sufficient to support the growing demand for connected computers.
Clearly, homogeneous machines provide a processing environment more
hospitable for inter-computer message transfer. Unfortunately, many
organizations have discovered too late the advantages of
inter-connection, having already acquired machines of different
manufacture for separate computing requirements. The capital
investment represented by those computers, in both hardware and
software, often prohibits their replacement with more compatible
counterparts. Ignoring the data translation problem because it can be
avoided in homogeneous environments is being unresponsive to the real
needs of a large segment of the computing community.

Conveying meaning of transmitted bits in a heterogeneous
environment is not as simple as it may first appear. The difficulty
arises because of the total lack of an industry standard for the
internal representation of information in computers. The market is
filled with machines of every description: they support
sign-magnitude, or one's or two's complement arithmetic, 12, 16, 24, 32,
36, 48, or 60 bit word lengths, and unique floating point number
representations. At the software level, there are different ways to
represent complex numbers, vectors, arrays and other data structures.
Discrepancies exist even in the case of character data. Although the
ASCII character set has become an industry standard, different
machines still ascribe different contextual meanings to control
characters such as form feed, line feed, tab and carriage return.
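
To make the integer case concrete, the following sketch interprets one bit pattern under each of the three arithmetic conventions named above. The function name `decode_integer` and the scheme labels are illustrative assumptions, not part of any scheme proposed in this report.

```python
SCHEMES = ("sign-magnitude", "ones-complement", "twos-complement")


def decode_integer(bits: int, word_length: int, scheme: str) -> int:
    """Interpret an unsigned bit pattern of the given word length as a signed integer."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    if not 0 <= bits < (1 << word_length):
        raise ValueError("bit pattern does not fit in the word length")
    sign_bit = 1 << (word_length - 1)
    magnitude_mask = sign_bit - 1
    if not bits & sign_bit:
        return bits  # non-negative values agree in all three schemes
    if scheme == "sign-magnitude":
        return -(bits & magnitude_mask)    # low bits hold the magnitude
    if scheme == "ones-complement":
        return -(~bits & magnitude_mask)   # negation inverts every bit
    return bits - (1 << word_length)       # two's complement


# The same 12-bit pattern, read by three different machines.
pattern = 0b111111111011
readings = {scheme: decode_integer(pattern, 12, scheme) for scheme in SCHEMES}
```

A receiver that is handed only the twelve bits has no way to choose among the three readings; it also needs to know the sender's convention and word length.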
1.4 Summary


There are several methods for facilitating general information
exchange between two dissimilar processors. This report examines
these methods in light of a set of design considerations for computer
networks. There are problems inherent to any scheme for presenting data
in different formats to processors with different requirements and
these, too, are examined. The report does not claim to solve the problem
of inhomogeneity. Rather its intent is to examine the
alternatives, and to offer an adaptable and extensible scheme for
facilitating interprocessor communication in heterogeneous
environments.


Chapter I has attempted to review computer networking and the
relationship between host-to-host message passing and interprocess
communication. The problem of moving information between
heterogeneous processors is introduced, and the intent of the research
stated.


Chapter II proposes design considerations [...]
network supporting funct[...]
data translation that [...]
heterogeneous environment [...]
respect to the stated des[...]

[...] Chapter II is the subject of Chapter III. The mechanics of the scheme are
presented and alternative designs argued.

Chapter IV discusses problems that are inherent to any data
translation  mechanism,  and  suggests  practical  ways  to  deal  with  these 
problems.  Chapter  V introduces  implementation  considerations  for 
format  translators,  and  Chapter  VI  presents  some  areas  for  future 
study. 


The development of any technique acceptable for providing
communications in a heterogeneous network environment must be guided by
the anticipated operating requirements such a facility may face. Any
such scheme must be flexible, extensible, provide enough
functionality to compensate for the cost and overhead it incurs, and also
be easy to use. From its beginning, the research reported herein has used
a set of design principles as a basis for the evaluation of strategies to
provide general interprocessor communications. The following section
describes those design principles.


2.1  Design  principles 

Process Addressing -- Almost all of the documented computer networks in
operation today support node to node message transfer. In the header
of the message being sent, the transmitting node designates a second node
on the network as the target for that message. Implicit in this method
of information exchange is the binding of each communicating process
to a network node.

To send data, a process at one network site builds a message and
addresses it to the network node that represents the process receiving the
data. The sending process then passes the message to the network to be
delivered. When it is accepted at its destination, the message is passed
to a process running in a host at that site for which the information in
the message was intended. It has been suggested <Farber1, Walden>
that rather than addressing messages to a receiving network node, the
transmitting process address messages directly to the receiving process.
The subnetwork interface at each node is then responsible for finding
and accepting messages addressed to any processes currently active at
its node.


Process addressing has several inherent advantages compared to the
more standard technique of node or processor addressing.

The most attractive feature of this approach is that it
allows a uniform conceptual point of view. The
processor oriented view requires a rather continual
translation from process name to the process that
supplied the service. This continual translation is
required for reliability and flexibility. [<Farber1> p. 7.]


The added flexibility offered by process addressing is a result of
having the physical location of the receiving process be transparent to
the sender. This transparency facilitates the dynamic relocation of
running processes on a network for purposes of load leveling or in the
event of partial node failure.


Since a message is not directed to a particular processor it
can reach several instances of a process, each running under the same
process name at different network nodes. By allowing the duplication of
names, the communications system facilitates broadcast
announcements to several or all network nodes. It further supports
identical processes on different processors for increased reliability
through parallel redundancy.
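
The delivery rule described above can be sketched as follows. The classes `Message` and `NodeInterface` and the function `circulate` are hypothetical names introduced for illustration; the sketch assumes a topology in which every node sees every message.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    process_name: str   # the addressee is a process, not a node
    data: bytes


@dataclass
class NodeInterface:
    node_id: str
    # Process name -> messages accepted on behalf of that process.
    active_processes: dict[str, list[bytes]] = field(default_factory=dict)

    def start_process(self, name: str) -> None:
        self.active_processes.setdefault(name, [])

    def stop_process(self, name: str) -> None:
        self.active_processes.pop(name, None)

    def examine(self, message: Message) -> bool:
        """Accept the message only if the named process is active at this node."""
        inbox = self.active_processes.get(message.process_name)
        if inbox is None:
            return False
        inbox.append(message.data)
        return True


def circulate(nodes: list[NodeInterface], message: Message) -> list[str]:
    """Show the message to every node; return the ids of the nodes that accepted it."""
    return [node.node_id for node in nodes if node.examine(message)]
```

Under these assumptions the sender's code is identical whether the named process runs at one node, has been relocated to another, or runs at several nodes at once.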

Process addressing is not without a serious technical problem.
Besides depending on a network-wide process naming scheme, inherent in the
concept of location transparency is the requirement that every node be
allowed to examine every message. Each must compare the name of the
process being addressed with the names of the processes active at its
location. While reasonable for some network topologies, such as a simple
bus <Metcalfe2>, ring <Farber1>, or star, it is out of the question for
others. The advantage of a tree network <MRG>, for example, is its
ability to favor communications paths between certain processors. This
advantage is meaningless when every message must be circulated to every
node. In the case of multiply connected store and forward packet switching
networks <ARPA, Frank, Heart, Metcalfe1, Pouzin>, a mechanism would be
necessary to insure that every packet travelled through every node.

However, because some network structures are suited to it,
sending messages by process name is a legitimate operating feature for a
mechanism that supports interprocessor communications. Facilitating
process addressing is, therefore, a design consideration for such a
mechanism.

Expansion -- Short of special purpose networks designed with
particular components and applications in mind, extensibility is an
important consideration in network facility design. In the case of a
general interprocessor communications scheme, the issue takes two
forms.

First is the expansion of the network itself. The ability to add
nodes to a network with a minimum of disruption to the operation of
already existing network nodes has been a design consideration for and been
achieved by many networking efforts <Binder, Farber1, Frank, Fraser1,
Mills, Metcalfe2, Pouzin1>. It is equally essential that incremental
expansion of a network should cause minimal disruption to a mechanism that
provides data representation compatibility between processors.

The second concern is for the addition or modification of data
formats in use by the processors at nodes already part of a network. The
design of a network-wide communications scheme must anticipate the need for
such changes and provide the means for handling them with a minimum of
effort.

Functional Sophistication -- General interprocess information exchange
requires the support of a communications facility that allows for the
trans-network movement of a wide range of data types. The size of the
subset of data types handled by a mechanism and the flexibility with
which the supported data types can be manipulated are a measure of that
mechanism's sophistication.

Strategies for relatively simple information exchange between
processors in a heterogeneous environment have already appeared. An
example of such a strategy that handles a single data type is the
ARPANET TELNET <ARPA2> described in a later section. TELNET provides a
protocol for sending text across the ARPA network between host
processors. Each processor may store text in any of the several
commercially used internal representations for characters.

The single data type provided by TELNET does not offer the
sophistication required to support general interprocess
communications. Although characters are the most often considered data
type, they are only a very small subset of types used to transfer
information between processes. Textual information is more easily
handled because of the industry-wide recognition of the USASCII
character set <Bhushan>. However, providing data transfer in a
heterogeneous processing environment requires provisions for handling data
types without a standard format as well.
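
The reason text is the easy case can be shown in a few lines. This is a sketch of character-code translation through a common standard only, not of the TELNET protocol itself; the function names are hypothetical, and Python's `cp037` codec (an EBCDIC code page) is used here merely as an assumed example of a host whose internal character representation is not ASCII.

```python
NETWORK_CODEC = "ascii"   # the agreed standard representation on the network


def to_network_text(local_bytes: bytes, local_codec: str) -> bytes:
    """Sender side: local character representation -> network standard."""
    return local_bytes.decode(local_codec).encode(NETWORK_CODEC)


def from_network_text(network_bytes: bytes, local_codec: str) -> bytes:
    """Receiver side: network standard -> local character representation."""
    return network_bytes.decode(NETWORK_CODEC).encode(local_codec)


# A host storing text in EBCDIC sends to a host storing text in ASCII.
stored_on_sender = "HELLO".encode("cp037")
on_the_wire = to_network_text(stored_on_sender, "cp037")
stored_on_receiver = from_network_text(on_the_wire, "ascii")
```

Each host needs to know only its own representation and the standard. No comparable agreed standard exists for integers or floating point numbers, which is the gap noted above.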
Application-level Transparency -- Removing the programmer from the
details of machine operation has become a generally accepted notion
among members of the computer field. An example of this kind of
thinking is found in <Corbato>. In line with this philosophy is the
logical separation of the internal and external data formats used in
ordinary data processing. In this sense, internal data formats are
hardware dependent and external formats are those conceptual items with
which the applications programmer and the human end-user must deal.


The importance of this distinction was noted as early as 1968 by the
National Bureau of Standards.


...the internal representation of data is concerned with
the manner in which particular computers and other hardware
store and move the data around inside the system. This is
not the user's province, and there should be no imposition
on him as to how it is done. For example, it should be of
no concern to him whether the hardware represents his
data by means of 6-bit, 8-bit, or 64-bit units within the
computer. He should have no concern with "packing"
and "unpacking" of characters within the computer words. He
should not be troubled with physical file units. These are
all aspects of the supporting technology... [<Little> p.
93]


Application of this view of data maintenance to the problems of
general process communications in a network demands that the necessary data
reformatting be transparent to the applications programmer. He should
have a consistent view of data items regardless of their actual
representation or where in a network they may originate.

Minimal Host Overhead -- Providing an interprocess data communications
service among heterogeneous processors requires some level of data
reformatting. In large part, the amount of overhead required to
perform data translation depends on the strategy used to achieve
format compatibility. A general scheme for facilitating such
communications and an associated implementation, however, should not
presume on the sophistication of processors connected to the network as
hosts.

This concern is slightly apart from the development of a data
communications scheme, as it is more an issue of a scheme's
implementation. The distribution of network related functions, such as
message formatting, between the subnetwork communications components and
their associated hosts is not a settled question for networking in general.
The issues are discussed in Section 5.2. Nonetheless, in order to be
responsive to the needs of those network environments that include
hosts with limited processing power, a mechanism for providing
interprocessor communications must be designed with the demands it places
on host processors in mind.
Minimal Message Overhead -- The limiting of overhead added to the
network hosts is required to support the use of devices with limited or
inflexible processing capabilities as hosts. Minimizing message
overhead is a consideration that addresses the total cost of message
processing. These two design goals combine to minimize the overhead
caused by message passing, and then force as much of the remaining overhead
as possible into the node rather than into the network hosts.


This total overhead includes the processing required at each
network site (host, subnetwork node, or special network processing
modules) and traffic on the communications links. Overhead is
represented at each network site by the maintenance of a
software/hardware base plus the processing time to perform the message
reformatting. At the communications level, message overhead appears in
the form of header and information-describing bits that increase the
length of data messages being carried. Reducing the overhead at these
levels increases the effective throughput of each message
transmission and reduces the message processing required at each
network node.


Reliability - The reliability of a network based system is a function of
several different aspects of design and implementation. The
categories or areas that must be considered include: failure of a
communications link, failure of the node hardware supporting message
handling functions, surfacing of hardware/software design errors, and
system fault-tolerance for each of these categories. These issues are
involved; however, not all of them are relevant to a discussion of
facilitating meaningful host-to-host information exchange. Thus, a
simplified view of reliability is adopted.

The interprocess communication facility considered here is a
functional improvement to a rudimentary bit-passing communications
scheme. However, increasing the functionality of a network
communications system involves increasing the number and/or complexity of
required system modules; both the size and the sophistication of a
mechanism are directly related to failure through error in design. It
follows, therefore, that an important consideration in the design of a
strategy to increase system function is to limit the number and
complexity of additionally required hardware and software modules. It is
with this limited view of reliability in mind that the following
strategies are evaluated.

2.2  Possible  strategies 

Several schemes for facilitating communications in a
heterogeneous environment are presented in this section. The common
basis for each of these strategies is their need for a format
translation from the internal data format used by the transmitter to that
used by the intended receiver.


It is well recognized that hosts in a heterogeneous
network use different bit patterns for encoding information.
Data translation is the basic capability which permits
hosts to communicate with each other in spite of
their differences. It follows that a data translation
capability is central to any effective capability to
communicate among heterogeneous computers. [<Kimbleton1> p. 555.]


The differences in the schemes discussed below lie in the steps that each
requires to perform that translation.


2.2.1 User translation


The simplest, and so the most often adopted, attitude towards
providing formatting for data transmitted over a network attempts to
avoid the issue completely. In some cases, networks consist of a
totally homogeneous collection of processors and software
environments, and so never require any data translation <Haverty, Mills,
Swan, Wulf>. However, the majority of currently operating networks
that have adopted this approach do not fall into this category.
Networks such as ARPANET, DCS, CYCLADES, and ETHERNET are designed to
support heterogeneous processor environments, yet leave the data
translation necessary to facilitate general interprocessor communications
totally to the applications programmers. (ARPANET does provide some format
translation for specific types of process to process communication.
These will be discussed later.)

For network user communities for which the coordinated use of more
than one network processor is an infrequent requirement, handling data
format incompatibilities at the applications level on a special case
basis may be sufficient. This seems inappropriate, however, for
heterogeneous networking efforts investigating the issues relevant to
distributed data bases and distributed operating systems. It is these
functions especially that require a high degree of interprocessor,
interprocess information exchange.

DCS is an example of such a network project. The installation of a
distributed operating system on a fully heterogeneous DCS-type
network has already been discussed <Rowe>. The DCS project head
agrees that a data reformatting mechanism would be an important
addition to his research efforts; however, the problems of general
format translation are too complicated to be addressed by his
researchers at this time <Farber2>.

2.2.2 Receiver translation

One approach to actually facilitating data communications is to
provide data translators at each node eligible to receive messages from
the network. In such a scheme, the transmitter performs no data
reformatting. To send a message, a process only forms a data block to be
transferred using the internal format native to the processor on which
it is running. The data traverses the network in its original format,
but carries with it, in some network-wide format, a description
of the processor at which the message originated. The transmitting
process can always know the nature of its supporting host and insert this
information into the message being sent.

When a message is accepted at a network node, the receiver reads the
message field that identifies the transmitting processor type. It then
performs any conversion necessary to translate the sender's internal
format into the internal format appropriate to the receiving host. The
identity of the transmitter needs to (and can) be known to the receiver,
while the receiving node remains unknown to the transmitter. This
condition supports the node independent (or process) addressing
previously discussed.
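
The receiver-side dispatch just described can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
processor type names, the message layout (an origin tag plus an
untranslated body), and the choice of a 32-bit integer as the data item are
all hypothetical and do not come from any of the networks discussed.

```python
import struct

# Hypothetical processor types; the names are illustrative only.
BIG_ENDIAN_HOST = "big32"
LITTLE_ENDIAN_HOST = "little32"


def send(origin_type: str, value: int) -> dict:
    """Transmitter: ship the data in its native format, tagged with its origin."""
    fmt = ">i" if origin_type == BIG_ENDIAN_HOST else "<i"
    return {"origin": origin_type, "body": struct.pack(fmt, value)}


class Receiver:
    """Node that keeps one translator per processor type it may hear from."""

    def __init__(self, own_type: str):
        self.own_type = own_type
        # Map from the sender's processor type to a function that turns the
        # sender's bit pattern into a value usable on this host.
        self.translators = {
            BIG_ENDIAN_HOST: lambda body: struct.unpack(">i", body)[0],
            LITTLE_ENDIAN_HOST: lambda body: struct.unpack("<i", body)[0],
        }

    def accept(self, message: dict) -> int:
        # Read the origin field, then apply the matching translator.
        translate = self.translators[message["origin"]]
        return translate(message["body"])
```

Note that `send` never names its destination format; only the receiver
needs a table keyed by sender type, which is what keeps process addressing
possible.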

A major disadvantage of such a scheme is that for each processor to
be able to interpret messages from every other, it must have access to a
translator that can resolve each possible dissimilar processor pairing.
At every node there must be a translator to convert each of the internal
formats used on the network to the internal format used at that node. In
the case of 'n' different types of processor, this method of operation
requires that the network support 'n(n-1)' translators to be
completely general, since each processor must be able to communicate
with all of the other types of processor on the network. For diverse
environments, this number quickly becomes prohibitive.

This technique also hampers incremental system expansion. In
order for a new type of processor to communicate in the system
environment, it must be supported by a translator that translates from
every existing format into the format of the node being added.
Conversely, a translator that translates the new machine's format must be
added to every node already in the system. To continue to support every
possible communication path in the environment, every host requires
some modification when the 'n+1'th host is introduced into the system.
'2n' translators must be developed: 'n' to reside at the new node to
allow its neighbors to communicate with it, and one new translator for each
host already in the environment to allow them to receive communications
from the added host.
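
The translator counts quoted above follow from simple arithmetic, which the
short Python sketch below checks. The function names are introduced here for
illustration only.

```python
def receiver_translators(n: int) -> int:
    """Total translators for n processor types under receiver translation."""
    # Each of the n types needs a translator from each of the other n-1.
    return n * (n - 1)


def added_by_new_type(n: int) -> int:
    """New translators needed when type n+1 joins a network of n types."""
    return receiver_translators(n + 1) - receiver_translators(n)


for n in (2, 5, 10, 20):
    # Columns: existing types, translators in total, translators to add.
    print(n, receiver_translators(n), added_by_new_type(n))
```

The difference (n+1)n - n(n-1) simplifies to 2n, which matches the '2n' new
translators stated in the text.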

A slight variant of this scheme is to have each transmitter
perform all of the data formatting for its intended receiver.
However, this offers no relief to the need for a large number of
translators. Further, it interferes with process-to-process message
transfer and with network-wide message broadcasting by forcing the
sender of a message to anticipate the internal data format
requirements of its receiver.


2.2.3 Intermediate translator

A topologically different mechanism places a third party between two
communicating processes solely to perform any needed data
conversions. An experimental project on the ARPA network provides
access to such an intermediate translator for specific applications. The
project is the data reconfiguration service (DRS) <ARPA5>.

The DRS offers a solution to the problem of data format
incompatibility between a particular applications program and its
intended users. Through a predefined translation mapping, the DRS acts
as an interpreter between the program and its user, permitting each to
communicate in its own format.

There are two stages to the use of DRS. First, the applications
programmer must describe a mapping between the data formats native to the
processor hosting his program and the formats native to the
processors representing his program's users. This requires specific
knowledge of both of the formats involved and the I/O data
requirements of the application. Once such a mapping is fully
defined, the programmer prepares a description of the conversion in a DRS
supported language and catalogs that description by a unique name with the
DRS.


When a user process wishes to communicate with such an applications
program, it makes a connection with the DRS and requests, by name, the use
of the appropriate format translation description prepared by the
applications programmer. The DRS then makes a connection to the desired
program and from then on the program and its user communicate through the
DRS, each data transfer being reformatted according to the specified
reconfiguration scheme.
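
The two stages of use can be pictured with a toy relay in Python. This is a
sketch of the idea only: the class, its method names, and the use of plain
functions as "mapping descriptions" are assumptions made for illustration
and do not reproduce the actual DRS language or connection protocol.

```python
from typing import Callable, Dict, Tuple

Transform = Callable[[bytes], bytes]


class ReconfigurationService:
    """Toy stand-in for an intermediate translator (hypothetical API)."""

    def __init__(self) -> None:
        # name -> (user-to-program transform, program-to-user transform)
        self.catalog: Dict[str, Tuple[Transform, Transform]] = {}

    def register(self, name: str, to_program: Transform,
                 to_user: Transform) -> None:
        # Stage one: the applications programmer catalogs a mapping by name.
        self.catalog[name] = (to_program, to_user)

    def connect(self, name: str, program: Transform) -> Transform:
        # Stage two: a user requests the mapping by name and receives a
        # relay that reformats data in both directions.
        to_program, to_user = self.catalog[name]

        def relay(user_data: bytes) -> bytes:
            reply = program(to_program(user_data))
            return to_user(reply)

        return relay
```

For example, a mapping registered with `to_program` converting one
character code to another (and `to_user` doing the reverse) would let the
user and the program each see only their own format.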


The result is that both the applications program and its users only
handle messages in their own respective formats.


The user process behaves as if it were connected
directly to the server process, and vice versa. The DRS
appears transparent to both processes; its function is to
reconfigure data that pass in each direction between
them into formats amenable to each of their processing
requirements. [<Anderson> p. 3.]


The DRS is effectively transparent at the application level and yet
extensive data translation may be taking place.

As implemented on ARPANET, an intermediate third party for data
reformatting has only limited use. Each application requiring the
service must catalog the appropriate format descriptors at the
reconfiguration service site. Each DRS mapping description is
specific to an application, as well as to the formats of the
associated processors. These descriptors provide a syntactic
structure which can be applied to incoming bit strings to delimit the
separate data items for reformatting. Such a description is essential to
the reformatting process.


2.2.4  Standard  format 

The most often cited network communications facility uses a
standard intermediate format to exchange data between potentially
dissimilar hosts. This facility is TELNET <ARPA1>. Running on
ARPANET, TELNET and its companion protocol for file transfer, FTP
(file transfer protocol) <ARPA4>, support the transfer of characters from
one network host to another. These protocols are discussed below.

The TELNET protocol is intended to carry characters between a
process representing a human user at a data terminal and a process
expecting to communicate with a terminal. TELNET forces
standardization of character formats by interposing the notion of a
network virtual terminal (NVT) between the two communicating
processes. Each host maintains a resident translator that performs the
data reformatting between the internal representation of character data
used by the host and that of the NVT. Any host-resident process that
either emulates or services a remote terminal must communicate with the
network through an instance of such a translator.
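
A host-resident translator of this kind can be sketched with Python's
built-in character codecs. The sketch assumes only the character-code
aspect of the NVT (a 7-bit ASCII wire format); the class name is
hypothetical, terminal control conventions are ignored, and the codec
"cp037" merely stands in for one possible host character set.

```python
class NVTTranslator:
    """Translates between a host's character code and 7-bit ASCII."""

    def __init__(self, host_codec: str):
        # host_codec is any Python codec name, e.g. "cp037" (an EBCDIC
        # variant) or "ascii" for a host that already uses ASCII.
        self.host_codec = host_codec

    def to_network(self, host_bytes: bytes) -> bytes:
        text = host_bytes.decode(self.host_codec)
        # Raises UnicodeEncodeError if a character has no 7-bit ASCII form.
        return text.encode("ascii")

    def from_network(self, nvt_bytes: bytes) -> bytes:
        text = nvt_bytes.decode("ascii")
        return text.encode(self.host_codec)
```

Two hosts with different codecs each hold one such translator, and neither
needs to know which character code the other uses internally.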

The standard character format used by TELNET is seven-bit
USASCII. The data representation and conventions adopted for NVT, as
described in the TELNET specifications, were

intended to strike a balance between being overly
restricted (not providing hosts a rich enough vocabulary for
mapping into their local character sets), and being overly
inclusive (penalizing users with modest terminals).
[<ARPA2> p. 1.]

This is the original TELNET protocol. However, a scheme for
providing extensions to the NVT through the "principle of negotiated
options" has been added. The principle of negotiated options allows two
communicating processes to discuss and agree to the use of each
available extension to the standard NVT format. Since not all options will
be supported at all sites, the ability to decline as well as request
and accept the use of options is provided. By using the hand-shaking
protocol, two processes can find the maximal set of options that is
appropriate for their use.
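
The outcome of such a negotiation can be modeled as finding the options
both sides support. The Python sketch below is a deliberately simplified
model, not the TELNET wire protocol: the function names and the option
names used in the example are hypothetical.

```python
def negotiate(requested, supported_by_peer):
    """Return (accepted, declined) for the options one side requests."""
    accepted = {opt for opt in requested if opt in supported_by_peer}
    declined = set(requested) - accepted
    return accepted, declined


def agree(options_a, options_b):
    """Each side requests everything it supports; keep what both accept."""
    accepted_by_b, _ = negotiate(options_a, options_b)
    accepted_by_a, _ = negotiate(options_b, options_a)
    return accepted_by_a & accepted_by_b
```

With site A supporting {"echo", "binary", "extended-ascii"} and site B
supporting {"echo", "binary", "line-width"}, `agree` yields the two options
they share; the remaining options are declined and the plain NVT
conventions stay in force for them.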


The options available are all extensions and enhancements of the NVT.
They include changing the disposition of control characters (carriage
return, line feed, form feed, tab), extending the character set and
altering the message format. As described in the TELNET option
specifications, these extensions are provided

to permit sites to obtain more elegant solutions to the
problems of communication between dissimilar devices than is
possible within the framework provided by the Network
Virtual Terminal. [<ARPA3> p. 1.]


It  is  through  the  mechanism  of  negotiation  that  use  of  these  options  is 
controlled  by  the  communicating  hosts. 


The file transfer protocol was designed to provide a mechanism for
file movement across ARPANET. As with TELNET, the communicating FTP
processes agree through negotiation on the data format for the
information transfer. Each host performs the translation necessary to
convert its internal representation into and out of the
intermediate format being used for the data exchange.

The need for data reformatting in the hosts is discussed in the
original specifications for FTP. While crossing the network, a text file
can be represented in the character set used by the TELNET NVT.


Data is transferred from a storage device in the sending
host to a storage device in the receiving host. Often
it is necessary to perform certain transformations on
the data because data storage representations in the
two systems are different. For example, NVT-ASCII has
different data storage representations in different
systems. PDP-10's generally store NVT-ASCII as five
7-bit ASCII characters left justified in a 36-bit word.
360's store NVT-ASCII as 8-bit EBCDIC codes. MULTICS stores
NVT-ASCII as four 9-bit characters in a 36-bit word. It may
be desirable to convert characters into the standard
NVT-ASCII when transmitting text between dissimilar systems.
The sending and receiving sites would have to perform the
necessary transformations between the standard
representations and their external representations.
[<ARPA4> p. 9.]
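
The two 36-bit storage layouts mentioned in the quotation can be made
concrete with a little bit arithmetic. The Python sketch below is an
illustration only: it models a 36-bit word as a Python integer, the function
names are hypothetical, and it assumes the spare bit of the five-character
layout is the low-order bit (which is what "left justified" implies).

```python
def pack_five_7bit(chars: bytes) -> int:
    """Five 7-bit characters left-justified in a 36-bit word."""
    if len(chars) != 5:
        raise ValueError("exactly five characters per word")
    word = 0
    for c in chars:
        word = (word << 7) | (c & 0x7F)
    return word << 1  # 35 data bits, then one unused low-order bit


def unpack_five_7bit(word: int) -> bytes:
    word >>= 1  # drop the unused low-order bit
    return bytes((word >> shift) & 0x7F for shift in (28, 21, 14, 7, 0))


def pack_four_9bit(chars: bytes) -> int:
    """Four characters, each in a 9-bit field, filling a 36-bit word."""
    if len(chars) != 4:
        raise ValueError("exactly four characters per word")
    word = 0
    for c in chars:
        word = (word << 9) | c
    return word


def unpack_four_9bit(word: int) -> bytes:
    return bytes((word >> shift) & 0x1FF for shift in (27, 18, 9, 0))
```

The same text therefore occupies different numbers of words, with different
bit patterns, on the two kinds of host, which is why each site must convert
to and from the standard representation.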


For text files, two standard character representations (NVT-ASCII and
EBCDIC) are supported by FTP. Options for specifying format control
information are also available. The human FTP user sets up the
appropriate options and then initiates the file transfer.


Non-text files may also be moved by FTP. Those are transferred as
unexamined blocks of bytes of a specified length. However, even
uninterpreted binary data can cause a problem in representation
between host systems with different internal word lengths.


It is not always clear how the sender should send data,
and the receiver should store it. For example, when
transmitting 32-bit bytes from a 32-bit word-length system to
a 36-bit word-length system, it may be desirable (for
reasons of efficiency and usefulness) to store the 32-bit
bytes right-justified in a 36-bit word in the latter
system. In any case, the user should have the option of
specifying data representation and transformation functions.
It should be noted that FTP provides for very limited
data type representations. Transformations desired
beyond this limited capability should be performed by
the user directly or via the use of the data
reconfiguration service. [<ARPA4> pp. 9-10.]

An option is provided to allow the human user to specify the
logical byte size of the data being sent. Through this mechanism the user
can force the receiver to block and pad the data for storage.

This...is intended for the transfer of structured data. For
example, a user sending 36-bit floating point numbers to a
host with a 32-bit word could send his data...with a
logical byte size of 36. The receiving host would then
be expected to store the logical bytes so that they could
be easily manipulated; in this example putting the 36-bit
logical bytes into 64-bit double words should
suffice. [<ARPA4> p. 13]
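
The blocking and padding that the receiver is expected to perform can be
sketched as follows. This Python fragment is an assumption-laden
illustration, not part of FTP: it treats the incoming data as one
big-endian bit stream, and the function names are hypothetical.

```python
import struct


def split_logical_bytes(stream: bytes, byte_size: int) -> list:
    """Cut a bit stream into logical bytes of byte_size bits each."""
    total_bits = len(stream) * 8
    value = int.from_bytes(stream, "big")
    mask = (1 << byte_size) - 1
    count = total_bits // byte_size  # trailing pad bits are ignored
    return [
        (value >> (total_bits - (i + 1) * byte_size)) & mask
        for i in range(count)
    ]


def store_in_double_words(logical_bytes) -> bytes:
    """Right-justify each logical byte in its own 64-bit double word."""
    return b"".join(struct.pack(">Q", b) for b in logical_bytes)
```

Knowing the logical byte size lets the receiver keep item boundaries
intact, but it still says nothing about what each 36-bit item means.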


It  is  only  through  this  option  that  any  information  on  the  intended 
format  or  use  of  binary  data  can  be  passed  along  with  the  bits  in 
non-text  files.  The  problem  of  non-character  data  types  is  only 
considered  in  this  way  by  FTP. 


Another ARPANET project that has had to deal with the problems of
interprocessor communication in a heterogeneous networking environment is
the National Software Works (NSW). The NSW project recognizes the
existence of large software systems that can serve as "tools" for
further software development. Presently these software systems are
scattered across the ARPANET.


...the National Software Works will provide users with
access to software development tools on whichever machine
the tools happen to be. User's files are moved to the
tools over the NET, so the tools do not have to be
reprogrammed for each new computer. People building tools
may select the machine which is best suited for the tool
they are building... [<Crocker> p. 5.]

Two phases of NSW are specified to address the data translation
issues. These are interprocess communication and the file transfer
system.


Interprocess communication under NSW was originally to be
supported by a system called the Procedure Call Protocol (PCP) which was
later augmented and renamed the Distributed Processing System (DPS).
DPS was designed to support information transfer between dissimilar
network hosts. The protocol was to include data communication
through the use of standard intermediate representations.
<Kimbleton3, WhiteJ> The scheme was to handle most fundamental data
types.

Until mid-August 1975, NSW planned to provide for
communication between most of its building blocks through the
Distributed Processing System. In August, DPS was formally
dropped from the NSW plan in favor of a much less
complicated scheme called MSG. [<Kimbleton3> p. 1-58.]

In January 1976, the preliminary specifications for MSG were
released. The report <MSG> deals with the data reformatting issue in a way
different from that of DPS.


Message exchange...is expected to be the most common
mode of communication among NSW processes. To send a
message, a process addresses it by specifying the address
of the process to receive the message and then executes an
MSG "send" primitive which requests MSG to deliver the
message. [<MSG> p. 1-8.]




A message is a string of bits created in the local
memory of a sending process. MSG sends the message to a
receiving process by duplicating the bit string in a
specified portion of the receiving process's local
memory. MSG itself imposes no further structure on
messages, nor does it interpret the contents of messages.
[<MSG> p. 2-6.]


Plans for data format standardization to support interprocess
communication were dropped in MSG.


The file transfer system was designed to perform file format
translations on data files as they were moved by NSW across ARPANET.
This facility, too, has been reconsidered.


The file transfer system is heavily dependent on DPS.
Since DPS has been discontinued, the initial NSW
implementation is going to use FTP to move files. Later
refinements may provide for the non-FTP supported
features of the file system. [<Kimbleton3> p. a-68.]


In summary, then, although the original NSW design included an
examination of the data formatting issues, the current project effort has,
at least for now, laid those issues aside.


TELNET and FTP offer two examples of the use of a standard data
format to support information transfer between dissimilar processors on a
computer network. Negotiated options are an extension of the
mechanism that allows flexibility in the selection of the intermediate
representations two processors will use in a given exchange.

Both TELNET and FTP force each transmission between processors to
conform to a universally observed intermediate data format (IDF). To
transfer data, each processor reformats its information into the IDF, and
then sends the translation. Upon receipt of a message, a
processor must perform a format translation to change the IDF
representation into its own. This mechanism provides processor
independent interprocess communication, since the broadcast message data
format is the same for every system host. The number of required
translators is reduced, as well. Two translators for each type of
processor are required: one for translation into and one for
translation out of the IDF. Again letting 'n' be the number of
dissimilar machines, the number of translators needed here is only
'2n'. That is, each processor in the environment must support exactly two
translators.
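
The two-translator arrangement can be sketched in Python using integers as
the data item. Everything named here is an assumption for illustration: the
`Host` class, the use of `struct` format strings to stand for a processor's
internal format, and the choice of a 64-bit big-endian integer as the IDF.

```python
import struct

# Assumed intermediate data format: 64-bit signed integer, big-endian.
IDF_FMT = ">q"


class Host:
    """A processor with its own internal integer format (hypothetical)."""

    def __init__(self, native_fmt: str):
        self.native_fmt = native_fmt  # struct format, e.g. "<i" or ">h"

    def to_idf(self, native: bytes) -> bytes:
        # Translator 1: internal format -> intermediate data format.
        (value,) = struct.unpack(self.native_fmt, native)
        return struct.pack(IDF_FMT, value)

    def from_idf(self, wire: bytes) -> bytes:
        # Translator 2: intermediate data format -> internal format.
        # Raises struct.error if the value does not fit the native format.
        (value,) = struct.unpack(IDF_FMT, wire)
        return struct.pack(self.native_fmt, value)
```

Each host carries exactly these two translators no matter how many other
processor types exist, and a sender never needs to know which host will
call `from_idf` on its message.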

A universal format also facilitates incremental system expansion.
Since each new processor need only be able to understand the
relationship between the IDF and its own format, its addition to the
system does not require knowledge of the current configuration.
Further, the processors already in the system will only communicate with
the new entry in a format they already know. No modifications or additions
to them are required. The burden for system expansion is solely on the
processor being added, which is precisely where it belongs.

2.3 Summary


The previous sections have presented design alternatives for a
mechanism to support interprocessor communications in a heterogeneous
computer network. The selection of a design for implementation should be a
direct result of measuring the proposed mechanisms against desired
design characteristics. The design characteristics being considered
are the following:

-- process addressing

-- easy expansion

-- functional sophistication

-- application-level transparency

-- minimal host overhead

-- minimal message overhead

-- reliability


These are applied to the three proposed mechanisms described in the
preceding section:

-- receiver translation

-- intermediate translator

-- standard intermediate format


Process Addressing - The internal data formats used by a running
process depend on the format employed by the processor on which that
process is active. Only by delaying the binding of a target data
format to each message until that message is accepted at the node on
which the desired process is active can process addressing be
facilitated.

Both translation by receiver and the use of an intermediate
format delay the final stage of data reformatting until a message
reaches its destination. These mechanisms do not interfere with
addressing messages to processes. An intermediate translator, on the
other hand, requires the specification of the target format, and
therefore the receiving processor has to be identified before a
message can be translated and retransmitted to its intended
destination. A scheme based on such a translator, then, cannot
support process addressing, while the other two strategies can.


Easy Expansion - Expansion includes both the addition of processor
types to a network and the extension of the formats used by
processors already supported. An intermediate translator acts as a
central agency for all data reformatting. Under such a scheme, any
revisions required for system expansion are localized at that
translator, and so the modifications can be made easily. Similarly,
because an intermediate format demands that communications only appear in
the network standard, expansion of a system based on that mechanism impacts
only the translator at the site being added or changed.

However, as described, in a network with 'n' processor types, the
modifications required for the receiver translation mechanism increase as
'n'. ('n-1' and '2n' translators are affected by format extensions and
additions respectively.) Compared with the other two schemes, receiver
translation carries a high cost of expansion.


Functional Sophistication - Each strategy is logically sufficient to
support a full range of data reformatting facilities.

Application-level Transparency - In large part, the impact felt by
end-users of any network mechanism depends on the host or network
operating system to which user software must interface. For example,
TELNET offers almost complete application-level transparency while DRS
requires a considerable amount of information from the applications
programmer. TELNET is fully supported by systems software and DRS is not.
The difference lies in the way the description of passed data is handled.
This issue is discussed further in Chapter III. Transparency
at the application level is less a function of the scheme used to handle
messages and more a function of the chosen scheme's implementation. In
this regard the three translation schemes each offer the same
opportunity for application-level transparency.

Minimal Host Overhead - Both TELNET and FTP interpret the
standard data format being employed through a translator process that runs
in the network hosts. Because the actual data translation for an
application independent intermediate data format would be performed as a
preprocessing step for all messages, and because the number of
translators required for each host is small, translation of the
intermediate format could be withdrawn from the host and placed at a
lower functional level in the node. Receiver translation could also be
performed at a node level below the receiving host, but the large number
of translators required could force an extra degree of node component
sophistication. An intermediate translator eliminates node resident
reformatting overhead, but requires one or more nodes (and hosts)
dedicated to data reconfiguration.

Minimal Message Overhead - While an intermediate translator
eliminates the need for any data translation at the communicating
nodes, the installation and maintenance of and communications to
special purpose reformatting nodes require additional overhead.
Receiver translation requires exactly one data reformatting stage
(that at the receiver), but demands the management of a large number of
translators. An intermediate format requires exactly two data
reformatting steps, but greatly reduces the number of translators that must
be maintained over that for receiver translation, and so is the most
preferable of the three strategies.
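The trade-off between reformatting steps and translator count can be made concrete with a rough tally. The sketch below is illustrative only: it assumes a network with `num_formats` distinct host formats, that receiver translation needs one translator per foreign format at every node, and that an intermediate format needs two per node (host format to standard, and standard to host format). The function names are hypothetical.

```python
def translators_per_node_receiver(num_formats: int) -> int:
    """Receiver translation: one translator for every foreign format (assumed)."""
    return num_formats - 1


def translators_per_node_idf() -> int:
    """Intermediate format: one translator into it and one out of it (assumed)."""
    return 2


def reformatting_steps(strategy: str) -> int:
    """Reformatting steps per message, as stated in the paragraph above."""
    return {"receiver": 1, "intermediate_format": 2}[strategy]


# Tabulate the per-node translator count as the number of formats grows.
for n in (3, 10, 50):
    print(n, translators_per_node_receiver(n), translators_per_node_idf())
```

Under these assumptions the receiver-translation count grows with every format added to the network, while the intermediate-format count stays fixed at the price of one extra reformatting step per message.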

Reliability - The criteria for measuring reliability of the
proposed mechanisms are the number and complexity of critical
components. Each of the three alternatives requires the integrity of the
two nodes between which data is being exchanged. The use of an
intermediate translator also requires that a third entity to perform the
data translation be functioning. This scheme has three critical modules
while both receiver translation and the use of a standard intermediate
format have only two.

The nodes in a receiver translation environment must maintain
translators for every format used in the network. Therefore, the size and
complexity of their network support software and hardware is greater
than the package required in an intermediate format environment.
Of the three, this rough measure of reliability favors the use of a
standard format.

Figure 2-1 summarizes the evaluation of the three design strategies with
respect to the design considerations discussed. The figure includes
examination of both the "ideal" implementation and, where applicable,
an existing implementation of each strategy. The rating of the
strategies for each consideration is qualitative. Where a strategy is
decidedly more responsive to a design consideration than the
alternatives, it is marked with a "plus" and the others are marked with
"minuses" (e.g. reliability). Conversely, when one strategy is decidedly
worse than the others in a particular category, it is marked with a
"minus" and the others are marked "plus" (e.g. process addressing).
For some considerations, the specific implementation of a strategy is
marked "minus" and the general use of that same strategy is marked with a
"plus." These markings indicate that while the current implementation of
the strategy is not responsive to that consideration, an
extension/generalization of it would meet the designated design goal
(e.g. sophistication).

The discussion in the preceding section motivates the use of an
intermediate data format to facilitate interprocessor communication in a
heterogeneous computer network. This section will address the
mechanisms needed to support an IDF and the selection of standard data
representations.

The conceptual view of interprocessor communications is depicted in
Figure 3-1.

Figure 3-1.

'PA' and 'PB' are processes residing in processors 'HA' and 'HB'
respectively. The translators are responsible for any reformatting
necessary between the data representation used by their corresponding
host processors and the network standard.

data format moved through that link.


There are six steps to the successful transfer of data from one
process to another.

* the sending host passes data to its associated translator.

* the translator reformats the data to conform to the
network standard representation.

* the translator passes the reformatted message to the
subnetwork to be carried across the network.

* the subnetwork module at the receiving node passes the
incoming message to its associated translator.

* the receiving translator reformats the message from the IDF
to the representation appropriate for its host processor.

* the receiving translator passes the reformatted message to
the receiving process.
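The six steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the thesis's design: host A is assumed to hold 16-bit one's complement integers, host B 16-bit sign-magnitude integers, and the network standard is assumed to be 16-bit two's complement. All function names are hypothetical.

```python
BITS = 16
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def ones_complement_to_int(word: int) -> int:
    """Decode a 16-bit one's complement word (assumed format of host A)."""
    return -(~word & MASK) if word & SIGN else word


def int_to_twos_complement(value: int) -> int:
    """Encode as 16-bit two's complement (assumed network standard)."""
    return value & MASK


def twos_complement_to_int(word: int) -> int:
    """Decode a 16-bit two's complement word."""
    return word - (1 << BITS) if word & SIGN else word


def int_to_sign_magnitude(value: int) -> int:
    """Encode as 16-bit sign-magnitude (assumed format of host B)."""
    return (SIGN | -value) if value < 0 else value


def sending_translator(host_word: int) -> int:
    # Step 2: reformat from host A's representation to the standard.
    return int_to_twos_complement(ones_complement_to_int(host_word))


def subnetwork(message: int) -> int:
    # Steps 3 and 4: carry the message across the network unchanged.
    return message


def receiving_translator(message: int) -> int:
    # Step 5: reformat from the standard to host B's representation.
    return int_to_sign_magnitude(twos_complement_to_int(message))


def transfer(host_word: int) -> int:
    # Step 1: the sending host passes its data to its translator.
    standard = sending_translator(host_word)
    delivered = subnetwork(standard)
    # Step 6: the reformatted result is handed to the receiving process.
    return receiving_translator(delivered)


# The value -5 leaves host A as 0xFFFA and arrives at host B as 0x8005.
print(hex(transfer(0xFFFA)))
```

Neither host sees the intermediate two's complement form; only the two translators handle it.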

To perform its function, the translator is passed a buffer in an
input data format and builds a buffer containing the reformatted
information. The translator can be broken down conceptually in the
following way (Figure 3-2).

The standardization of an intermediate network format in turn allows the
standardization of the network interface section of the
translator. Similarly, the portion of the translation section that is
tuned to the IDF will be transportable among translators. The host
interface and its half of the translation specifications will
necessarily be host dependent.


3.1  Data  description 

There are two aspects of data description for a data item. The
first is the definition of the data class or type of each item, such as
character, integer, or instruction. In general, a string of bits carries
no indication of the kind of data item it is intended to represent.
This is because the overwhelming majority of currently available
computer systems are based on the Von Neumann philosophy for storing
digital information. These systems do not rely on any inherent
distinction between the internal storage of different data types to
manipulate information. Further, the semantic meaning of a string of
bits is derived solely from the context in which the bits are used.

The Von Neumann form states that data and
program are indistinguishable. This form assumes fixed
size binary words or characters [bytes] which allow
programs to be treated as data. These computational units
are manipulable by a large, general purpose set of
operations. Meaning is not inherently represented in the
contents of these units; rather it is assigned to the
contents of these units by the program manipulating them.
[<Feustal2> p. 1.]

By example, an 8-bit string sent to a teletype may ring a bell, while that
same string may be moved to an arithmetic unit to represent an integer
value. Moving the same bits to the instruction register of a processor
may cause yet another effect. It is the use of the bits that defines
their data type, and not the bits themselves.

The second aspect of data description is the specification of the
internal data format used to represent the value of an item of a
particular data type. "Integer" is a data type. "Two's complement," on
the other hand, is a data format used in many processors to
represent integer values. Other formats used for integers include
sign-magnitude, one's complement and decimal. Saying that a 16-bit data
item is an integer specifies its type, but not the format used to encode
the value it represents, and so is an incomplete description of the item.
Describing the item completely requires the inclusion of both the data
type and the internal format of the data item, i.e. the sixteen bit item
is an integer represented in two's complement format.
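The distinction between type and format can be illustrated by decoding one 16-bit pattern under three integer formats. This is a sketch only; the function name `decode_16` and the format labels are chosen for illustration and do not come from the original text.

```python
def decode_16(word: int, fmt: str) -> int:
    """Interpret a 16-bit pattern as an integer under the named format."""
    negative = bool(word & 0x8000)
    if fmt == "twos_complement":
        return word - 0x10000 if negative else word
    if fmt == "ones_complement":
        # A negative value is the bitwise complement of its magnitude.
        return -(~word & 0xFFFF) if negative else word
    if fmt == "sign_magnitude":
        # The top bit is the sign; the low 15 bits are the magnitude.
        magnitude = word & 0x7FFF
        return -magnitude if negative else magnitude
    raise ValueError(f"unknown format: {fmt}")


# The same bits denote three different integers.
pattern = 0xFFFE
for fmt in ("twos_complement", "ones_complement", "sign_magnitude"):
    print(fmt, decode_16(pattern, fmt))
```

Knowing only that the item is a 16-bit integer does not select among these three results; the format must be stated as well.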

Just as with the data type, determination of the data format used to
encode the value of an item cannot be made by examining the data bits.
When two data items are added together in a processor that supports
integer addition, the operation identifies them as integers and the
architectural design of the "add" instruction identifies their format.
PDP-11's support two's complement integer addition. S/370 supports
addition both of two's complement and decimal format integers; in
this machine, integers may be represented in either of these two
formats.

The data reformatting function replaces the bits that reflect a
value for a data type in one representation with the bits that denote the
same value in a second representation. (For a discussion of how
difficult this mapping can be, refer to Chapter IV.) To faithfully
reconfigure an input data stream, a format translator must know how
those bits were interpreted in the computing system environment from
which they originated. That is, the translator must know whether to
treat the incoming 8-bit bytes as EBCDIC characters or as quarters of
32-bit one's complement integers. As the data bits themselves carry no
indication of their type or format, they alone cannot specify which data
translation scheme must be applied to interpret them. A
description of the type of data being moved and the internal
representation scheme used for each type must be available to the
translator to facilitate the proper reformatting.
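A short sketch shows why the translator needs a description alongside the bits. It assumes Python's `cp037` codec as a stand-in for EBCDIC (one of several EBCDIC code pages) and big-endian byte order for the 32-bit words; the `interpret` function and its description labels are hypothetical.

```python
def interpret(raw: bytes, description: str):
    """Interpret the same bytes according to a supplied description."""
    if description == "ebcdic_characters":
        # cp037 is one EBCDIC code page, used here as an assumption.
        return raw.decode("cp037")
    if description == "ones_complement_int32":
        values = []
        for i in range(0, len(raw), 4):
            word = int.from_bytes(raw[i:i + 4], "big")
            if word & 0x80000000:
                values.append(-(~word & 0xFFFFFFFF))
            else:
                values.append(word)
        return values
    raise ValueError(f"unknown description: {description}")


raw = bytes([0xC1, 0xC2, 0xC3, 0xC4])
print(interpret(raw, "ebcdic_characters"))
print(interpret(raw, "ones_complement_int32"))
```

Both calls receive identical bytes. Only the description argument determines whether they are read as four characters or as one integer.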

The information needed to form such a description for each
message being sent across a network is known exclusively by the
transmitting process. While the receiving process may be expecting a
message containing particular data items, there is no assurance that an
incoming message conforms to that expectation. The description of data
items anticipated by the receiver is important to error checking and
process synchronization. However, insuring a precise transfer of data
items with their original values requires the sole use of a data
description supplied by the sending process.

A semantic description of the data, then, as well as the data
being transmitted must cross three communications links: from the
sending process to its translator, from that translator to the
translator for the receiving process, and from there to the receiving
process itself. These links can be broken into two categories: the
host-translator connection and the translator-translator connection. The
distinction is important. Any protocol used to exchange
information between a host and its associated translator must reflect the
host's data formatting needs. Such a protocol is constrained by the
specific characteristics of that host. While the type of
information to be carried by a translator-to-translator protocol
depends on the hosts included in the network, the form each type of data
must take in the protocol is totally independent of all existing
equipment. Strategies for transporting data descriptors must be
evaluated in light of both of these connection categories. The rest of
this section discusses two alternative strategies for passing the
necessary information.


3.1.1 Passing description by prearrangement

One strategy for passing the semantic description of a string of data
bits is to rely on a prearrangement of the bit stream format. The
mechanism is best explained by example. Two communicating modules may
agree that the next stream of bits transmitted will form 16-bit two's
complement integers. The sending module then transmits sixty-four
data bits. The receiver breaks the incoming data stream into four
16-bit two's complement integers, assuming that to be the type and
format of the data being sent. If the prearrangement mechanism has
functioned correctly, the receiver has been successfully transmitted the
context and substance of the four data items. In a more complex
situation, two communicating modules may handle a bit stream that
represents a specific mix of data items of different data types (e.g. the
first item is a 16-bit two's complement integer, the second item is a
7-bit ASCII character, etc.).
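The sixty-four bit example can be sketched with Python's standard `struct` module. The shared format string stands in for the prearrangement; big-endian byte order and the names `send` and `receive` are assumptions made for illustration.

```python
import struct

# The agreement made in advance: four big-endian 16-bit signed integers.
# In standard mode, struct's "h" is a 16-bit two's complement value.
PREARRANGED_FORMAT = ">4h"


def send(values):
    """Pack four integers into a 64-bit stream with no embedded description."""
    return struct.pack(PREARRANGED_FORMAT, *values)


def receive(stream):
    """Break the stream into four integers, trusting the prearrangement."""
    return list(struct.unpack(PREARRANGED_FORMAT, stream))


stream = send([1, -2, 300, -32768])
print(len(stream) * 8, receive(stream))

# A receiver holding a different agreement (unsigned 16-bit) reads the
# same bits without error but recovers different values.
print(list(struct.unpack(">4H", stream)))
```

Nothing in the stream reveals the mismatch in the last line. Correct interpretation depends entirely on both sides holding the same agreement.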

Prearrangement of the type and format of the data items in a bit
stream can be handled in two ways. First, the character of the stream can
be fixed by system design and implementation. For example, one data
terminal can only understand 7-bit ASCII characters, while another
can only understand 8-bit EBCDIC characters. A mismatch in design
prevents the transmission of meaningful information from one to the other.
Second, two communicating modules can select a format for the bit stream
through format option negotiation. The semantic content of the passed
bit stream is still described by prearrangement. However, format
negotiation allows both the types of data items being moved, and the
formats in which they are encoded to change dynamically as conditions
warrant.


ARPANET implementations of the TELNET and FTP protocols are
examples of the prearrangement technique. The original TELNET
specifications required each inter-host TELNET transmission to format data
in the 7-bit USASCII character set. For two communicating TELNET
processes, a description of the trans-network data stream was
specified by the system design. That is, by prearrangement, the data
stream was a stream of USASCII characters.

The development of negotiated options offered a natural extension to
the TELNET data description strategy. Rather than forcing two TELNET
processes to communicate through a design determined format, negotiation
allowed the selection of a data stream format to be delayed until
just before each data transfer was to take place. This was a step
towards increasing communications flexibility. Data description was
still accomplished by prearrangement between two TELNET processes,
but each option (or combination of options) represented a different
description that could be applied to the data stream being transmitted.
Once two communicating processes agreed on a set of options, each could be
certain that the receiving TELNET was applying the correct data
description when interpreting the data bits being sent.

The TELNET strategy attacks the problem of carrying a data
description across a network from one host to another. The data type of
the items being passed is fixed as characters. This permanent
requirement of a single data type represents data description
prearrangement by system design. The data format representation of the
characters being exchanged is also described through
prearrangement. Instead of being fixed at design time, however, the
format of the characters being transmitted can be respecified through
option negotiation. The character formats currently available as
TELNET negotiable options include standard USASCII, extended ASCII and
binary. Under TELNET, renegotiation is allowed before every
transmission and so the format agreed upon may change as often as
every message.

Prearrangement with optional format negotiation has proven very
successful for TELNET. However it seems untenable for a general
communications scheme where it is necessary to communicate data of many
different types. The rigidity imposed by total agreement through original
design is an unacceptable base for a general facility. The only flexible
form of prearrangement appears to be option negotiation as implemented for
TELNET. This, however, would be a costly mechanism for general
interprocess communication. TELNET is already characterized by
proliferation of options <ARPA>. The introduction of each new data type
can be expected to be accompanied by its own set of options. Indeed, each
data type would itself represent an option. Support for messages
composed of items of mixed data types would require a further extension
of the negotiation scheme to handle composite messages.

While prearrangement of data type and format is unwieldy for a
system requiring generality, it may be adaptable to some instances of
host-translator connections. An example of such a case is a network
connection to an unintelligent peripheral device such as a data
terminal. A terminal has neither the means nor the need to win format
flexibility through option negotiation. It must send and receive bit
streams that conform to a fixed data description. In this example, the
translator must match the data type (character) and data format (ASCII,
EBCDIC, etc.) of the terminal at all times. Such a binding is a natural
application for using prearranged data description to pass the semantic
content of data bits.


3.1.2 Passing description by data tagging

The second alternative for passing the description of a bit
stream is to mark each data item in a message as to its type and
format. In such a scheme, each datum in a transmitted message has a
standard data descriptor associated with it. Embedding these
descriptors in the actual data stream so that they precede the items which
they describe allows the receiver to delimit and determine the type of
each item separately as it is delivered. As long as it is paired with its
data descriptor, each piece of data is a totally self-describing item.
The support of self-describing data insures against the separation of a
message and its context.


Precision in data transfer permits semantics and
structural information which exists in the sender's instance
of a datum to be reproduced in the receiver's image of
the datum, even though it may be represented in the
systems involved in entirely different
fashions... Data of a given type should be recognizable as
such [by a receiver] without the need for context... A
particular service can achieve data precision by meticulous
specification of the protocols by which data is transferred.
This need is widespread enough, however, that it is
appropriate to consider inclusion of a facility to provide
data precision within the mechanism itself. [<Haverty> pp.
8-9.]

The technique of attaching descriptors to data items to make them
self-describing has been called "tagging" <Feustal1, Feustal2,
Iliffe>.


Tagging does not totally eliminate the need for data description by
prearrangement; it only moves the agreement to a different level of data
handling. Instead of requiring prearrangement of the content of each data
stream passing between communicating modules, a scheme based on data
tagging allows any message to contain any legal combination of data
descriptors (tags) and data bits. The tags in the message describe
data being transmitted and no pre-message agreement of the items in
the message is necessary. Required, however, is agreement of the
meaning and form of the data tags. Every communicating module must
understand and conform to the use of tags to describe data
bits. Without common agreement at this level, messages built out of
self-describing data would be unintelligible.

To send four 16-bit two's complement integers, a transmitting
module builds a message out of the sixty-four data bits and the
"16-bit two's complement" tag. For 8-bit tags, this would mean a
message length of 96 bits. Then the module transmits the message. The
receiver detects the tag that, through prearrangement, designates a 16-bit
two's complement integer and uses the sixteen bits that follow it to
form the number. Likewise, the other three data items are delimited,
and the transmission of the four integers is successful.
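The tagged message in this example can be sketched as follows. The tag value `0x01`, the big-endian layout, and the function names are assumptions made for illustration; the text specifies only that each 16-bit item carries an 8-bit tag, giving 96 bits for four items.

```python
import struct

# Hypothetical 8-bit tag meaning "16-bit two's complement integer".
TAG_INT16_TWOS_COMPLEMENT = 0x01


def build_message(values):
    """Precede every 16-bit item with its 8-bit descriptor."""
    message = b""
    for value in values:
        message += struct.pack(">Bh", TAG_INT16_TWOS_COMPLEMENT, value)
    return message


def parse_message(message):
    """Use each tag to delimit and interpret the item that follows it."""
    items = []
    offset = 0
    while offset < len(message):
        tag = message[offset]
        if tag != TAG_INT16_TWOS_COMPLEMENT:
            raise ValueError(f"unknown tag {tag:#04x}")
        (value,) = struct.unpack_from(">h", message, offset + 1)
        items.append(value)
        offset += 3  # one tag byte plus two data bytes
    return items


message = build_message([1, -2, 3, -4])
print(len(message) * 8, parse_message(message))
```

The receiver needs no agreement about the contents of this particular message, only about what each tag value means.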

Self-description more readily facilitates general inter-processor
communication by allowing data streams of mixed data types to appear in
messages. The overhead associated with supporting the negotiation process
is replaced with the overhead of encoding, decoding and moving the extra
bits required for the descriptors.

Self-describing data involves a corruption of the data stream with
the item descriptors. While it provides full flexibility of message
format and content and is acceptable to intelligent communicating
modules, the scheme is not at all appropriate for the data needs of a
simple peripheral acting as a host. Although special equipment could be
built, most currently available unintelligent devices cannot tolerate
communications strategies that require modification of the data
handling protocol. This includes interpreting or even simply
discarding descriptors embedded in the data. For such devices, tagging
is an unacceptable proposition.

Because data tagging allows more flexibility than option
negotiation, the use of data descriptors to build self-describing data is
preferable for translator-to-translator connections. While this scheme
offers the same advantage when applied to host-translator
connections, some potential hosts are unable to perform the message
processing necessary to support it. The description passing
strategies used in each host-translator link, therefore, must be
selected to allow the translator to support its associated host in the most
reasonable fashion.

3.2  The  standard  format 

An intermediate data format (IDF) is intended to provide two
processors using dissimilar internal data representations with a
common ground for information exchange. The basis for their
communications is the standard intermediate data format, interpretable
by means of data stream descriptors. This section discusses the
selection of the data formats to act as the intermediate representation
of each data type to be moved among the translators.

In the ideal sense, the choice of intermediate formats can be made
arbitrarily. Messages with data items represented in a standard format
pass only between network translators designed specifically to handle
whatever format is picked. Only factors of economy constrain that


3.2.1 ASCII character representation

Two alternative formatting schemes have been proposed. The first
scheme involves the transmission of each data item in an ASCII
character representation of its value <Kimbleton1, Teager>. Every
datum has a human readable form that can be built as a character
string. The scheme proposes this form as the intermediate
representation of the item. To transmit a small floating point
number, for example, a sending translator would broadcast the ASCII
characters representing the sign and integer part of the data item, an
ASCII period for the decimal point, and finally the ASCII characters
representing the fractional part of the number being passed.
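The floating point example can be sketched in a few lines. The fixed number of fraction digits, the always-present sign character, and the function names are assumptions of this sketch; the scheme as described says only that the sign, integer part, period, and fractional part are sent as ASCII characters.

```python
def to_ascii_form(value: float, fraction_digits: int = 3) -> bytes:
    """Render sign, integer part, period and fraction as ASCII characters."""
    text = f"{value:+.{fraction_digits}f}"
    return text.encode("ascii")


def from_ascii_form(data: bytes) -> float:
    """Rebuild the receiving host's internal value from the character form."""
    return float(data.decode("ascii"))


wire = to_ascii_form(-12.5)
print(wire, len(wire) * 8, "bits at 8 bits per character")
print(from_ascii_form(wire))
```

Each character occupies a full byte here, so even this small value takes seven bytes in transit.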

The most convincing argument for the use of an intermediate data
format based on the ASCII character set is the already wide-spread use of
this format for the internal representation of human readable
information. By performing information input/output functions with
ASCII characters to terminals and line printers, processors are
already required to support translation between ASCII character
strings and the machine-dependent internal representations for other data
types. Choosing a standard format for which many processors already
support software/firmware translation can greatly simplify the initial
development and the continual maintenance of the IDF translator
module associated with each network host.

A strategy using the character-based intermediate format
mechanism for ARPANET has been suggested to facilitate the transfer of
records to and from data files. The strategy assumes that the
function to be supported is the transfer of data file records from the
secondary storage of one system to another, and that such data files are
associated with a data description of the records (type and format of the
items in a record).


Predicated upon the existence of a suitable logical
data description of the file being accessed, the
following four step approach to data translation in a
networking environment seems reasonable... The four steps
are:

using the access method originally used to write the
file to retrieve the desired record at the source
site,

using the logical data description of the record
together with knowledge of the I/O routine
originally [used to] write the file... to transform
the record from the form in which it is internally
stored to a character normal form analogous to that
in which the record would be listed by a line
printer,

using variants of existing ARPANET protocols to
transmit the record from source to destination,

using at the destination, the record and its logical
data description to reconvert from the character
normal form to that used for internal storage of
information (corresponds to the usual
transformations performed in supporting data entry to
go from the manner in which data is entered to that
in which it is stored). [<Kimbleton1> pp. 2-7 to
2-8.]

Thus, this is an example of a situation where all data items are
translated into the same format and the actual description of data
types is passed by prearrangement. Similarly, it would be possible to use
the tagging strategy, where the tag would describe the actual type of an
item being transmitted as a string of characters.

3.2.2 Formats based on data types

The alternative to a character based intermediate format is the
definition of a set of data representations with the formats best
suited to each type of data item. An integer might be represented in a
32-bit two's complement format, as an example. Characters would
probably use the standard ASCII character set.

An advantage to establishing a different intermediate format for
every data type is that in many cases, the data translations can be
relatively straightforward. Because most processors operate on two's
complement numbers, for example, choosing a two's complement
intermediate format for integers will necessitate only the simplest of data
reformatting for many machines.

The selection of one of these two alternatives for the design of an
intermediate format only impacts the implementation of the network
translators and the use of the communications links that connect them. The
data passing strategy enforces transparency of the intermediate format
at all other system levels. The two schemes must, therefore, be
evaluated with respect to message processing overhead incurred at the
translator module and the data transmission overhead required to carry
information across a network.


Processing overhead at the translator is a direct function of the
complexity of the required data translations. In general, converting from
one machine readable form to another is simpler than manipulating character
string representations. Translating a 24-bit one's complement
integer into 32-bit two's complement representation is certainly
easier than translating that same integer into as many as 64 bits (eight
characters) to form the ASCII string. This overhead has prompted the
support of a high-level language option to allow human user controlled
specification of the internal format to be used to store application
program data. The user is recommended to store data items predominantly
used in calculations in binary or decimal format. Items that are required
extensively in human readable I/O operations may be stored in 'picture'
format to facilitate their conversion to character strings [<PLI> p.
222.].

There is no question but that the human readable form of most data
requires a longer bit representation than common machine readable formats
for the same data. The increase in message length through the use of
bit-wise inefficient data formats is included in communications connection
overhead. The use of inefficient formats increases network resource
contention and decreases message throughput, affecting both cost and
performance.
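
The difference in message length can be made concrete with a small
calculation. The following is a minimal sketch, assuming a 24-bit binary
word and 8-bit characters as in the example earlier in this section; the
names WORD_BITS, BITS_PER_CHAR, ascii_bits and overhead_ratio are
illustrative and do not come from the original text.

```python
WORD_BITS = 24       # assumed width of the machine readable integer format
BITS_PER_CHAR = 8    # assumed width of one transmitted character


def ascii_bits(value: int) -> int:
    """Bits needed to send the integer as a decimal character string."""
    return len(str(value)) * BITS_PER_CHAR


def overhead_ratio(value: int) -> float:
    """How many times longer the character form is than the binary word."""
    return ascii_bits(value) / WORD_BITS


# Most negative value of a 24-bit one's complement integer.
most_negative = -(2**23 - 1)
print(ascii_bits(most_negative), overhead_ratio(most_negative))
```

Small values cost less in character form, so the ratio depends on the data
actually sent; the worst case is the sign plus seven digits.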


While the existence of support for ASCII format translation in many
processors encourages use of an ASCII-based scheme, simpler
translations and the overhead issues weigh heavily in favor of
data-type dependent intermediate formats. Were a scheme required
solely to transmit file records from processor to processor, the
character based formatting described above would be attractive.
Instead of building a data descriptor for each record, the information
passing utility could use the file descriptor already associated with each
file record. The items in the data record would then have to be forced to
conform to the format presented in the record descriptor, i.e.
translated into their ASCII-character equivalents. However,
inter-processor communications require more general message content. For
these, the character-based format is more costly than a scheme based on
selecting an applicable format for each type of data to be transmitted.

5.1 A possible IDF

This chapter has shown that the most reasonable choice for a
standard intermediate data format is one that transfers data items that
are tagged with their type and format. Further, for reasons of
efficiency, the format used to transmit the value of each data item
should be natural to that item's data type. Every item in an IDF of this
form has two parts, a data type tag and a data value. For a simple
data item such as an integer or a character, the "value" portion of
the item can merely be the actual number or character. A distinction
must be made, however, between primitive data types and composite data
types.


Figure 3-3 depicts an IDF representation of the letter "a". The left
half of the item contains the type tag for character. There is exactly
one such tag for every data type, and since their use is limited to
the IDF translators the assignment of tag values to data types is
totally arbitrary. The tag value is symbolized here by "CHAR". The
right half of the item contains the character being represented in
the intermediate format.

Primitive Data Type Format
Figure 3-3.

Besides integers and characters, the list of primitive data types
includes items such as double-precision integers, floating point
numbers, and boolean values.
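
A tagged primitive item of this kind can be sketched in a few lines. The
sketch below is illustrative only: the one-byte tag, the numeric tag
values, the big-endian byte order and the 16-bit width for integers are
all assumptions made for the example, since the text leaves the tag
assignment arbitrary. The names TAG_CHAR, TAG_INT, encode_item and
decode_item are hypothetical.

```python
import struct

# Arbitrary tag values; only the translators ever see them.
TAG_CHAR = 1
TAG_INT = 2


def encode_item(value):
    """Build a two-part item: a type tag followed by the value."""
    if isinstance(value, str) and len(value) == 1:
        # One ASCII character as the value part.
        return struct.pack(">Bc", TAG_CHAR, value.encode("ascii"))
    if isinstance(value, int) and not isinstance(value, bool):
        # 16-bit two's complement integer as the value part.
        return struct.pack(">Bh", TAG_INT, value)
    raise TypeError("no tag assigned for this data type")


def decode_item(data: bytes):
    """Read the tag, then interpret the value part accordingly."""
    tag = data[0]
    if tag == TAG_CHAR:
        return data[1:2].decode("ascii")
    if tag == TAG_INT:
        return struct.unpack(">h", data[1:3])[0]
    raise ValueError("unknown type tag")


print(encode_item("a"), decode_item(encode_item(-5)))
```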

The tagging mechanism also supports the transfer of data
structures. Although the set of compound data types that should be
supported is not clearly defined, composite data types such as arrays or
general data structures could be constructed from these primitive types.
Figure 3-4 is an example of an IDF representation for an array. The
tag value for an array is specified, and the "int" following it
indicates that this is an array of 16-bit integers in two's complement
format. The next three fields describe the number of dimensions and then
the range of those dimensions. All of this is followed by the data
values themselves.


array : int : 2 : 3 : 2 : 6 integer values


3 X 2 array of integers
Figure 3-4.
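
The layout of Figure 3-4 can be mimicked with a short encoder and
decoder. This is a sketch under stated assumptions rather than a defined
format: the field widths (one byte each for the two tags and the dimension
count, two bytes for each dimension range), the tag values and the
big-endian order are invented for the example, and the names
encode_int_array and decode_int_array are hypothetical.

```python
import struct

TAG_ARRAY = 3   # arbitrary tag value for "array"
TAG_INT = 2     # arbitrary tag value for 16-bit two's complement integer
HEADER = ">BBBHH"  # array tag, element tag, dimension count, two ranges


def encode_int_array(rows):
    """Encode a two-dimensional list of integers as header plus values."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    header = struct.pack(HEADER, TAG_ARRAY, TAG_INT, 2, n_rows, n_cols)
    flat = [value for row in rows for value in row]
    return header + struct.pack(f">{len(flat)}h", *flat)


def decode_int_array(data: bytes):
    """Recover the rows from the header fields and the value list."""
    tag, elem, ndims, n_rows, n_cols = struct.unpack_from(HEADER, data, 0)
    if (tag, elem, ndims) != (TAG_ARRAY, TAG_INT, 2):
        raise ValueError("not a two-dimensional integer array")
    offset = struct.calcsize(HEADER)
    flat = struct.unpack_from(f">{n_rows * n_cols}h", data, offset)
    return [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]


message = encode_int_array([[1, 2], [3, 4], [5, 6]])
print(len(message), decode_int_array(message))
```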




The selection of the optimal intermediate format for each data
item is a topic that requires further investigation. While there is some
question as to the number of bits required in each item, the
overwhelming use of two's complement arithmetic in commercial
processors indicates that the intermediate formats for integers should be
based on two's complement. The same is true for ASCII and the
intermediate representation of characters.

A data format is a scheme for representing the value of data items of a
particular type in a convenient way. For example, both the familiar
Arabic digits and Roman numerals are data formats for representing the
values of the counting numbers. Translation is the process by which
information in an input data format is mapped into its corresponding
representation in an output or target data format. A simple example of
this process is the conversion of 'XXV' to '25'.
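
The 'XXV' to '25' conversion can be written out as a small translator
from one format to the other. This is only an illustrative sketch; the
names ROMAN_VALUES and roman_to_arabic are not from the original text,
and the input is assumed to be a well-formed numeral.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50,
                "C": 100, "D": 500, "M": 1000}


def roman_to_arabic(numeral: str) -> str:
    """Map a Roman numeral string to the Arabic digit string."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        # A smaller symbol written before a larger one is subtracted.
        if i + 1 < len(numeral) and value < ROMAN_VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return str(total)


print(roman_to_arabic("XXV"))
```

Both strings denote the same counting number; only the format changes.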

Data translation can take place at two different levels. First is
the mapping of a data item of a particular type and format into a
different format for the same type. This is the case of translation
between one's and two's complement integers or between Arabic and
Roman numeral counting numbers. The second level of translation is
mapping a data item in one representation into a representation for an item
of a different data type. This process includes converting integers
into floating point values, or converting numeric characters into numbers.

Both levels of data translation are required to support general
interprocess communications in a heterogeneous environment. As a data item
is moved from one processing environment to another, its
representation must change to meet the data representation constraints of
its new host. Whether this involves mapping into a different format
for the same data type, or translating into a representation for a
completely different type, will depend on the data types and formats
supported in the new environment.

As already stated in the section on data description in Chapter III,
the data type and format of an item represented by a string of bits is
established when those bits are used. Whether they appear as an operand
to a one's complement addition or as an address loaded into the program
counter, these bits can just as well be sent to a line printer as a
character in the next instant. The conclusion to be drawn from this
is that the software running on a processor gives semantic meaning to
the data items. It is, therefore, the software, not the hardware, that
determines the data representations supported in a processing
environment.

The point is easily argued. A typical minicomputer has 16-bit
registers and an assembly language instruction that performs two's
complement addition on 16-bit operands. But with the proper software, this
processor can be made to perform 18-bit one's complement or even 64-bit
decimal floating point addition. True, the hardware facilitates
the manipulation of 16-bit two's complement integer data items, but the
hardware does not necessarily restrict the type and form of data that
can be interpreted by a processing system it hosts.

All of the problems associated with data translation are the result of
information moving between processing environments that support
different data types and formats. These environments are shaped by
software, and to say that a processor does not support a particular data
type means only that there is no intelligence in place to handle data
items of that form.
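
As an illustration of supplying that intelligence in software, the sketch
below performs 18-bit one's complement addition using only ordinary
integer operations. It is a minimal example under assumptions of its own:
the function names are hypothetical, bit patterns are held in Python
integers, and overflow of the 18-bit range is not detected.

```python
BITS = 18
MASK = (1 << BITS) - 1        # eighteen one bits
SIGN = 1 << (BITS - 1)        # position of the sign bit


def to_ones_complement(value: int) -> int:
    """Return the 18-bit one's complement pattern for an integer."""
    limit = (1 << (BITS - 1)) - 1
    if not -limit <= value <= limit:
        raise OverflowError("value does not fit in 18 bits")
    return value if value >= 0 else (~(-value)) & MASK


def from_ones_complement(pattern: int) -> int:
    """Interpret an 18-bit pattern as a one's complement integer."""
    if pattern & SIGN:
        return -((~pattern) & MASK)
    return pattern


def ones_complement_add(a: int, b: int) -> int:
    """Add two patterns, folding the carry out of bit 17 back in."""
    total = a + b
    if total > MASK:
        total = (total & MASK) + 1   # end-around carry
    return total & MASK


result = ones_complement_add(to_ones_complement(5), to_ones_complement(-3))
print(from_ones_complement(result))
```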

Performing precise data transfer requires the accurate and
complete movement of all of the information in the items being
transferred. When items must be exchanged by processors that support
different sets of data types and formats, translation problems can
occur. The rest of this chapter examines three of these problems, in
particular, precision, format incompatibility, and data type
incompatibility.


4.1 Precision


The precision of a data type format is a measure of the range of
values that can be represented in that format. For binary data, the
amount of information in an item is the number of significant bits the
current value of that item contains. Precision problems exist when the
input and target formats for a translation are formats for the same
data type, but can represent different ranges of values of that data
type. If data items represented in a high precision format are
translated into a format of lower precision, a loss of information can
occur.

For computers to exchange data of a particular data type,
translations may have to be performed between data formats based on
different word lengths. Some values that can be represented in 36-bit
words, for example, have no representation in 16-bit words. Items in the
larger format that require more than sixteen bits of precision cannot
be mapped into the sixteen bit format in a way that retains their
value.

The problem of precision loss can manifest itself at any point in the
communications system at which format translation is carried out. In the
case of systems that are based on an intermediate standard format,
these are the translations into and out of the IDF.

Precision problems can be avoided as items are translated into the
standard format by selecting standard representations that are of as high
a precision as the representations used in any of the host processors
in the network. Of particular concern are numeric data. There is no
single format of sufficient precision to represent the entire range of
integers or real numbers; however, it is necessary for an intermediate
format to be able to represent any number that may pass between
communicating processors. By considering the internal formats used by
the hosts on a network, an IDF of sufficiently high precision can be
chosen that contains representations for every value of each data type
that may appear in the network.

The formats of the receiving hosts, however, cannot be altered to
provide a representation for every required value. Their data format
precision is fixed by hardware and the software that implements
extensions of hardware defined formats. The translators at this
stage must address the possibility of receiving data that cannot be
adequately represented in their associated hosts.

The responsibility of the receiving translator is to distinguish
between data items that can and cannot be represented completely in the
available target format. The translator can report incidence of problems
in precision to the receiving host process through a predefined
protocol. By convention, some attempt at representing the offending data
item can be made. Once notified, it is the responsibility of the
receiving process to respond to the problem through discussion with
the process transmitting the datum.

A scenario typifying the problem consists of a sending host
process transmitting 36-bit integers to a processor using 16-bit
words. Since the standard format must be able to transfer all
possible integer data values, the intermediate format for integers may
require 64 bits. Because of the IDF, neither of the two communicating
processors has knowledge of the internal format used by the other.
They each see only their own format.

Integers passed to the receiving translator are examined for
their precision. Items whose value can be represented in the 16-bit
integer format are translated and tagged appropriately in anticipation of
movement into the host. If the data value can be represented as a double
precision (32-bit) integer, the translation is performed and the item in
the target format is marked with its description. When even double
precision is not sufficient to represent the incoming data value (i.e. it
carries more than 32 bits of precision) some information must be
discarded. An algorithm can be applied to the incoming data to select
32 bits to form a double precision item. This item is then tagged with its
description and a mark to indicate that the translation caused a loss of
information.
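
The three cases handled by the receiving translator can be sketched as
follows. The text does not say which algorithm selects the 32 bits to
keep, so this example simply assumes saturation to the nearest
representable value; that choice, and the names TranslatedInt, fits and
receive_integer, are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class TranslatedInt:
    value: int          # value delivered to the host
    description: str    # tag describing the target format
    lossy: bool         # mark set when information was discarded


def fits(value: int, bits: int) -> bool:
    """True if value is representable in a two's complement word."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def receive_integer(value: int) -> TranslatedInt:
    """Choose single precision, double precision, or a marked lossy item."""
    if fits(value, 16):
        return TranslatedInt(value, "int16", False)
    if fits(value, 32):
        return TranslatedInt(value, "int32", False)
    # Assumed selection rule: clamp to the closest 32-bit value.
    largest = (1 << 31) - 1
    clamped = largest if value > 0 else -largest - 1
    return TranslatedInt(clamped, "int32", True)


for incoming in (1000, 100000, 2**40):
    print(receive_integer(incoming))
```

A receiving process that sees the loss mark can then decide whether to
accept the value or raise the matter with the sender.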

The philosophy underlying the scheme to handle lost precision must
be one of "make do." Firmware or software mechanisms to support multiple
precision can offer an extended target format for the receiving
translator. In general, though, processes communicating across a
network need to anticipate the possibilities of sending or receiving
precision problems and cannot be [...]. Until all processors support
each data type with equal precision, the problem is unavoidable.


4.2 Format incompatibility


Another problem inherent to data translation is format
incompatibility. As is the case with precision, format
incompatibility problems occur when data items of a particular type and
in a particular format must be translated into a different format for the
same type. However, this incompatibility is strictly a function of
formatting scheme, and is not related to the number of bits allowed for
a value's representation.

The fractional values it is possible to represent "exactly" with a
specific number of bits differ from one formatting scheme to
another. The problem has been described in a warning that accompanies the
discussion of the automatic format conversions that occur in PL/I.


The rules for arithmetic conversion specify the way in which
a value is transformed from one arithmetic representation to
another. It can be that, as a result of the transformation,
the value will change. For example, the number .2, which
can be exactly represented as a decimal fixed point
number, cannot be exactly represented in binary. [<PLI> p.
270.]

In discussing the problems that surround the use of fractional data,
Kernighan and Plauger state:

The reason is simple: "0.1" is not an exact fraction in
a binary machine (in much the same way that 1/3 is not an
exact fraction in a decimal world); its nearest
representation in most machines happens to be slightly less
than 0.1. [<Kernighan> p. 91.]
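
The effect described in these two quotations is easy to observe. The
sketch below compares the exact decimal fraction with the value actually
stored by a binary floating point format; it assumes the binary floating
point numbers Python provides (IEEE 754 doubles on typical platforms), and
the name is_exact_in_binary is illustrative.

```python
from fractions import Fraction


def is_exact_in_binary(decimal_text: str) -> bool:
    """Compare the exact decimal value with its stored binary value."""
    exact = Fraction(decimal_text)           # e.g. "0.1" gives 1/10
    stored = Fraction(float(decimal_text))   # value the float really holds
    return exact == stored


for text in ("0.1", "0.2", "0.5", "0.25"):
    print(text, is_exact_in_binary(text))
```

Whether the stored value falls slightly above or slightly below the
decimal one depends on the format and its rounding rule; on IEEE 754
doubles the nearest value to 0.1 is slightly greater than 0.1.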


Another example is the format incompatibility that exists between
different representations for binary integers. The "minus zero" value in
the sign-magnitude and one's complement formats cannot be represented in
two's complement. This is not a precision problem since increasing the
number of bits allowed for the target (two's complement) format will not
make a difference.
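
A translator between these two integer formats has to decide what to do
with minus zero. The following sketch, which assumes 16-bit words and
uses the hypothetical name ones_to_twos, maps minus zero to ordinary zero
and reports that it did so.

```python
def ones_to_twos(pattern: int, bits: int = 16):
    """Translate a one's complement bit pattern to two's complement.

    Returns the new pattern and a flag that is True when the input was
    minus zero, which has no two's complement counterpart.
    """
    mask = (1 << bits) - 1
    if not 0 <= pattern <= mask:
        raise ValueError("pattern does not fit in the word")
    sign = 1 << (bits - 1)
    if (pattern & sign) == 0:
        return pattern, False          # non-negative values are identical
    if pattern == mask:
        return 0, True                 # all ones is minus zero
    return (pattern + 1) & mask, False # negative values differ by one


print(ones_to_twos(0xFFFC), ones_to_twos(0xFFFF))
```

Widening the word does not help: for any word size the all-ones pattern
still has no distinct two's complement equivalent.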

To the translator connected to a heterogeneous network, the
rounding and truncation associated with reformatting fractional
numeric data is unavoidable. The PL/I approach to format
incompatibility is to print a warning in the language reference
manual. The average application program (programmer) tolerates
inaccuracies in the least significant digits of calculated results, and
only when exactness is required is the issue raised. The problem cannot
be circumvented, and perhaps the most reasonable alternative for a
translator design is to issue a disclaimer and let the user beware.

4.3 Data type incompatibilities

The third problem of data translation is the resolution of data type
incompatibilities. To perform its function, the translator
associated with each network host must first determine the type of an
incoming data item. That information is then used to map the input
format representation into an output format representation of the same data
type and value. Precision and format incompatibility problems exist when
the input format for a particular data type supports a higher precision
or can represent different values than the output format for that same
data type. A data type incompatibility, on the other hand, is the complete
absence of an output format into which items of a data type being received
can be mapped. While an example of a precision problem is mapping
36-bit integers into 16-bit integers, an incompatibility is trying to
move 32-bit floating point numbers to a teletype. The information
carried by the floating point item is lost to the teletype because it has
no way to represent any data type but characters. In the context of
a computer network supporting a diverse set of processing
environments, data type incompatibilities can arise between hosts of
different capabilities. This is particularly true in the case of
special devices or unintelligent network hosts.

As with problems in precision, the translator must address data type
incompatibilities on a case by case basis. In some
circumstances, the value of an item of a data type not supported by
hardware may be marginally representable in the format for a second data
type that is supported. Floating point numbers can sometimes be
reasonably represented in an integer format. Boolean values can be
represented as integer zeros and ones. In other situations, no
intuitive alternative data type may exist. A hardware unit to perform
fast Fourier transforms on arrays of floating point numbers may not be able
to meaningfully handle any non-numerical data types.

Unlike precision problems, most instances of data type
incompatibility will need to be handled by the translator. This is
certainly true in the case of messages being sent to unintelligent
hosts or special purpose devices with limited processing power.
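
The case by case handling can be pictured as a table of acceptable
substitute types consulted by the translator. This is a sketch only: the
type names, the FALLBACKS table, the UnsupportedType exception and the
function translate_for_host are all hypothetical, and rounding a floating
point number to an integer is one possible convention among several.

```python
class UnsupportedType(Exception):
    """Raised when the host has no usable format for an incoming item."""


# (incoming type, substitute type) -> conversion used for the substitute.
FALLBACKS = {
    ("bool", "int"): lambda v: 1 if v else 0,
    ("float", "int"): lambda v: round(v),
}


def translate_for_host(type_name, value, supported):
    """Deliver the item in its own type, a substitute type, or fail."""
    if type_name in supported:
        return type_name, value
    for (source, target), convert in FALLBACKS.items():
        if source == type_name and target in supported:
            return target, convert(value)
    # No acceptable alternative: the translator, not the host, reacts.
    raise UnsupportedType(f"host cannot represent {type_name}")


print(translate_for_host("bool", True, {"int"}))
print(translate_for_host("float", 2.6, {"int"}))
```

In this sketch the exception stands in for whatever recovery action the
translator takes on behalf of a host that cannot cope with the item.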

When no alternative format is acceptable, the translator must
initiate the appropriate fault recovery procedures. Again, especially in
the case of unintelligent devices, the receiving host must be
shielded from extraordinary conditions. Whether the necessary action is
to dispatch a standard message to the transmitting process to inform
it of the problem, or to just ignore the offending message, it is a task
best left to the translator.

4.4 Summary

There are no elegant solutions to the problems of data format
precision and format incompatibility. A translation scheme can only
patch the environment. The inability to overcome some conditions of
heterogeneity remains. These problems can only be handled by
intelligently written applications software running on the appropriate
network hosts.

An applications program requiring unusual data precision or using
peculiar data types can announce its requirements to its
correspondents. Difficulty in resolving representation problems may
force communicating processes to resort to negotiation a la TELNET or to
complete abandonment of the processing task at hand. Unless a
receiving process can understand the type of data being sent to it, a
standard intermediate format and format translators are of no use.
When the type of the data is recognized, it is still necessary to
consider the possible problems of precision and format
incompatibility.

Examination of possible strategies for implementation of data
translators raises issues that are transparent to the translation
scheme itself. These issues include the interaction between the
translator and other mechanisms that perform message processing, and the
additional data describing functions required of the host.


5.1 Format translation and other message handling functions

Even without data format translation, successful transmission of
information between machines on a network requires several message
handling functions. The order in which these functions must be
applied to each message is fixed, and this order is depicted in Figure 5-1.
The following subsections briefly describe each function and its
relationship to the implementation of format translation.

A. Message transmission    B. Message reception

Figure 5-1.


5.1.1 Message packetizing


Since communications between processors may take the form of very
large messages (up to millions of bits for file transfer), a great deal
of consideration has been given to the advantages of sending long messages
one portion at a time. To distinguish between messages and pieces of
messages, the term "packet" is used to describe message fragments.
While messages are the unit of communication between processes, these
packets are the unit of data that moves through the communication
subnetwork.

The motivating concern over packet size is subnetwork
performance. There are tradeoffs to be examined.

Large packets have a lower probability of successful
transmission over an error-prone telephone line (and this
drives packet size down), while overhead considerations
(longer packets have lower percentage overhead) drive
packet size up. [<Crowther> p. 170.]

Cases can be made for the optimum packet size in a particular network
environment; the governing factors have different manifestations for
local networks <Farber1, Fraser1, Metcalfe2> than for geographically
distributed networks <Metcalfe1, Pouzin2>. (An especially good
examination of the issues for packet switching networks (i.e.
ARPANET) can be found in <Metcalfe1>.) It seems, however, that
regardless of the network, the fragmentation of at least some messages into
packets is necessary.

Figure 5-1 indicates that only fully assembled messages can
undergo format translation. In order to make message fragmentation the
simple-minded partitioning of messages into several packets of a fixed
length, the semantic content of the message (Chapter III) must remain
transparent to the disassembly process. The desire for this
transparency imposes an ordering on the two functions of message
packetizing and format translation.




Trying to perform data translation on message fragments to be
transmitted causes two problems. First, data reformatting will
likely, but unpredictably, change the size of the data items in a
message, partially defeating the packetizing mechanism. Secondly, and also
likely, the message may be split in the middle of a data item, severely
complicating any mechanism attempting to reformat that item. This
potential fragmentation of data items also discourages the
translation of received information in any form but fully reassembled
messages.
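
The second problem is easy to reproduce. In the sketch below a message of
three 16-bit items is cut into fixed-length packets without regard to
item boundaries; the names packetize and reassemble and the packet size
of three bytes are illustrative assumptions.

```python
def packetize(message: bytes, packet_size: int):
    """Cut a message into packets of a fixed length (the last may be short)."""
    if packet_size <= 0:
        raise ValueError("packet size must be positive")
    return [message[i:i + packet_size]
            for i in range(0, len(message), packet_size)]


def reassemble(packets) -> bytes:
    """Rebuild the message; only now can its items be reformatted."""
    return b"".join(packets)


# Three 16-bit items: 1, 2 and 3.
message = b"\x00\x01\x00\x02\x00\x03"
packets = packetize(message, 3)
print(packets)                       # the item 2 straddles both packets
print(reassemble(packets) == message)
```

Because the partitioning only counts bytes, no single packet is guaranteed
to hold whole data items, which is why translation waits for reassembly.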


5.1.2  Flow  and  error  control 

Simply, flow control is the process of ensuring that the
receiving host does not lose information from its sender at any time due
to too high a rate of data transfer. Mismatches of
sender-receiver pairs can result in packets arriving at their
destination faster than they can be ingested, overwhelming the
receiver, and forcing packets to be discarded.

Error control supersedes the normal error detection for packets
between subnetwork nodes, such as parity and checksum verification. These
simple types of errors are common when transmitting over potentially noisy
communication lines, and must be handled totally at the subnetwork level.
Rather, error control deals with the problems of lost or duplicated
packets. Packets may appear or disappear either as a result of the simple
checksum or parity errors (discarded by the subnetwork), or flow control
deficiencies or hardware failure.

Both error and flow control are functions concerned with the
movement of data packets over the communications subnetwork. These
functions handle the blocks of data that move between subnetwork
nodes, and so must be performed at a level between the actual transfer of
message bits across the subnetwork and message packetizing.


5.1.3 Encryption

As the use of computers for the storage and manipulation of
classified (military), proprietary (industrial) and confidential
(personal) information increases, the need for mechanisms for secure data
handling also increases. Particularly vulnerable to breaches of
information security are the communications paths between the nodes of a
computer network. Often these paths may be inter-laboratory, and so their
physical security cannot be assured. Unauthorized access to
information being carried in communications links can be thwarted by data
encryption. The intent of this section is to relate encryption to other
network message handling functions, in particular format translation.
Encryption techniques and associated protocols are not discussed, and for
these the reader is referred to <Kent>.

Message encryption can be broken up into two separate categories.
First is the encoding of the data field (Figure 1-2) of the message.
This field contains the information the sending process is trying to
transmit to the receiving process. Presumably, this would be the
primary target for unauthorized access. The other category is the
encryption of the packet control information that must accompany the data
as it traverses the network. Precisely what information must be passed
with the text of a message and how it may be encrypted for a given network
is partly dependent on the implementation of the communications
subnetwork. The protection of that information is not considered here.
This section is only concerned with the encryption of the actual text of
the message.

Protection modules that perform data encryption/decryption must be
at the level following format translation for information transmission
and conversely the level before format translation for information
reception.


With respect to functionality, protection modules are
constrained to be below the portion of the communication
system that engages in syntactic processing of message
contents... With respect to output from the host, encryption
can be performed only after such transformations as
device-specific code conversion, white-space
optimization, and formatting. With respect to input to
the host, messages must be deciphered before such
transformations as canonicalization, break character
detection, erase-kill processing, translation, escape
sequence processing, character echoing, and high priority
message recognition can be performed. [<Kent> p. 65.]

A format translator must receive unencoded message text in order to
perform the semantic analysis necessary for data reformatting.
However, a mechanism for message packetizing must only be able to
count and partition the bits in the data field of a message. It need not
have access to that field in its unencrypted form. Similarly, flow and
error control functions require access only to the unencoded header and
trailer fields of packets. Figure 5-1 depicts the functional level
of message handling appropriate for the encryption process.
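
The resulting order of operations can be sketched as a pair of
pipelines: translate, encrypt, then packetize on transmission, and the
reverse on reception. Everything in this example is an assumption made
for illustration: the function names are hypothetical, the intermediate
format is taken to be 16-bit big-endian two's complement, and the XOR
"cipher" is a stand-in with no security value. Flow and error control are
omitted.

```python
def translate_out(values):
    """Host integers to the assumed intermediate format."""
    return b"".join(v.to_bytes(2, "big", signed=True) for v in values)


def translate_in(data: bytes):
    """Assumed intermediate format back to host integers."""
    return [int.from_bytes(data[i:i + 2], "big", signed=True)
            for i in range(0, len(data), 2)]


def toy_cipher(data: bytes, key: int) -> bytes:
    """Placeholder for encryption and decryption (XOR with one byte)."""
    return bytes(b ^ key for b in data)


def packetize(data: bytes, size: int):
    """Count and partition bytes; the contents are never inspected."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def send(values, key: int, size: int):
    return packetize(toy_cipher(translate_out(values), key), size)


def receive(packets, key: int):
    return translate_in(toy_cipher(b"".join(packets), key))


packets = send([1, -2, 300], key=0x5A, size=4)
print(packets, receive(packets, key=0x5A))
```

Only the two translate functions need to understand the message text; the
packetizing step works equally well on enciphered bytes.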


5.2 Implementation of message processing functions

There  are  two  schools  of  thought  on  the  implementation  of  message 
handling  functions.  Both  philosophies  view  each  network  site  as  having  a 
host processor connected to a subnetwork node. Simply stated, one side
argues that all message processing should be transparent to the
communications subnetwork. The other argues that all message
processing should be transparent to the hosts. As a result, networks have
appeared that reflect both philosophies <Metcalfe1, Metcalfe2, Pouzin2>.

The principal advantage to performing all message processing
operations in the host is the simplification of the subnetwork node.
Removing any host-specific functions from the subnetwork level permits each
subnetwork node to be exactly like every other. This duplication
facilitates subnetwork maintenance and enhancement. Perhaps more
importantly, however, limiting the subnetwork function to delivering bits
accurately facilitates the interconnection of the subnetworks of different
networks. The incorporation of network-specific protocols into the
subnetwork nodes necessarily complicates the mechanism that moves
messages between subnetworks. This thought is expressed strongly in
a paper about CIGALE, the subnetwork for the CYCLADES packet switching
network.


It is clear that the CIGALE transparency is its major trump
to provide a communication service between existing
systems. Any additional well-wishing function tied with
the external world is likely to be incompatible and
detrimental to a good service. In particular,
communications networks studded with all sorts of bells
and chimes will end up as one of a kind networks, unable to
communicate, unless an ad hoc kludge be interposed so that
they at last exchange packets. [<Pouzin2> p. 159.]


Another argument for performing all message handling functions in the
host is based on data security. Performing encryption and decryption in
the network hosts is necessary to insure that no unencoded data need
ever leave the host. The importance of this consideration, however, is
minimized if the hardware performing message processing is considered to
be merely an extension of the host processor, as would be an I/O
channel, for example. The host and the message processor can be
physically close, and so they and the communications link between them
can enjoy the same level of physical security.

There are also two arguments for performing the message handling
functions at the subnetwork level. First is that regardless of the host
involved, the message processing operations are essentially the same for
every site in a network. A programmable subnetwork node can be customized
for each host, while the bulk of the software can be written one time
in a single language compiled for the nodes. This eliminates the need
to redevelop message handling routines at every host. It also greatly
simplifies the addition of a new host to the network.

The second argument is that keeping network related functions out
of the host minimizes the impact inclusion in a network may have on a
host software system. This facilitates the use of the network by users at
installations with either limited system expertise, or limited
processing power or flexibility.

Although the arguments seem irreconcilable, a compromise that
seems to be a natural conclusion to the controversy has been suggested
<Manning>. Discussion to date has centered on the partitioning of
functions between a subnetwork node and a host-resident network
control program (NCP). Providing a separate hardware level expressly for
message processing is responsive to both sides of the argument. Figure
5-2 depicts such a configuration. The additional hardware would host
a format translator, message packetizer, flow and error control
modules and, when required, a protection module for
encryption/decryption.


NETWORK <--> NODE <--> MESSAGE PROCESSOR <--> HOST

DEDICATED MESSAGE PROCESSOR
FIGURE 5-2.


The introduction of a "message processor" does not render the
network functions totally transparent to the processing systems of the
network hosts. Each host must still exchange data description
information with its associated format translator (Chapter III). This
impact to the host system cannot be avoided. However, with all other
message processing being performed at a separate level, the NCP
required can be relatively small. An area requiring further
investigation is how data description can best be supported by an
existing host system. (See section 5.3.)

The subnetwork node, on the other hand, can be functionally
limited to supporting bit-passing protocols. This requires no
host-specific information and so each subnetwork node can be
interchangeable with every other. Such a node has already been
suggested for some local networks <Mockapetris>.

While not all of the hardware and software at the additional
level can be standardized for all network hosts, it may be possible to
limit host-dependent information to the data translator. It is the
differences in the data representations of the hosts that force
message handling functions to be specific to the host with which they are
associated. These are the same differences that necessitate a
network-wide data translation scheme in the first place. The discussions
on data description (Chapter III) and on data format precision and
incompatibility (Chapter IV) underscore this need for translator
customization.

Simplifying the subnetwork node by separating the message
handling functions from it satisfies an argument for moving all
operations into the host. Moving all but a minimum of operations out of
the host and into a separate module that can be mostly
standardized is responsive to the arguments for moving these
functions into the subnetwork node.


5.3 Supporting data description in the host

If, as suggested in Section 5.2, the majority of the message
handling functions are performed in a separate message processor, then a
host need only be able to move messages back and forth between the
network-using processes it supports and the message processor. In
particular, a host-resident network control program (NCP) must
interface applications programs with the format translator.

Section 3.1 discussed methods of passing data description between
communicating modules. One of the two mechanisms described involved a
prearrangement of the semantic meaning of a bit stream by either
negotiation or by system design. The other was a data item tagging
scheme, and this was suggested as the more flexible of the two in a
general purpose processing environment.

An important consideration is the effect the implementation of a data
tagging scheme would have on the host operating system and user
community. Minimizing the impact on a community joining a network
makes the network a more attractive resource. In line with this
philosophy, a mechanism has been suggested to facilitate the
implementation of data description through tagging between host and
format translator.

Interprocess communication can be considered a case of data input and
output <Hoare>. The destination process receives the output from a
sending process in response to an input request. The implementation of
such a mechanism could be modelled after the I/O routine packages
currently available in single process environments for languages such as
FORTRAN IV, ALGOL and PL/I. These languages perform I/O on a user
specified transmission list and with a user specified format (i.e. the
FORMAT statement in FORTRAN, and the EDIT option in PL/I). The user
could request I/O as he would for a locally resident process with
which he wanted to communicate. When some higher authority determines that
the referenced process is active on some other network host, the network
I/O control program could combine the user's description of the data
with the data itself to form a tag-based data stream. This scheme has
the advantages of being familiar to most application programmers and
a straightforward "add-on" to existing systems.
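
As an illustration of how a user-supplied format and transmission list
might be combined into a tag-based stream, the following is a minimal
sketch. The one-letter format codes, the two-byte length field, and the
function names are assumptions made for this example; the report does not
define a concrete tag layout.

```python
import struct


def encode_tagged(format_list, values):
    """Pair each value with its user-supplied format code and emit
    tag (1 byte) + length (2 bytes) + value for every item.
    Codes are hypothetical: I = integer, R = real, T = text."""
    out = bytearray()
    for code, value in zip(format_list, values):
        if code == "I":
            body = struct.pack(">q", value)   # 8-byte big-endian integer
        elif code == "R":
            body = struct.pack(">d", value)   # 8-byte big-endian real
        elif code == "T":
            body = value.encode("ascii")
        else:
            raise ValueError("unknown format code: " + code)
        out += code.encode("ascii") + struct.pack(">H", len(body)) + body
    return bytes(out)


def decode_tagged(stream):
    """Recover (code, value) pairs using only the tags in the stream."""
    items, pos = [], 0
    while pos < len(stream):
        code = stream[pos:pos + 1].decode("ascii")
        (length,) = struct.unpack(">H", stream[pos + 1:pos + 3])
        body = stream[pos + 3:pos + 3 + length]
        if code == "I":
            value = struct.unpack(">q", body)[0]
        elif code == "R":
            value = struct.unpack(">d", body)[0]
        elif code == "T":
            value = body.decode("ascii")
        else:
            raise ValueError("unknown format code: " + code)
        items.append((code, value))
        pos += 3 + length
    return items


# The format list plays the role of a FORMAT statement; the value
# list plays the role of the transmission list.
stream = encode_tagged(["I", "R", "T"], [42, 2.5, "DONE"])
decoded = decode_tagged(stream)
```

The receiver needs no prior agreement about the order or types of the
items, which is the flexibility attributed to tagging above.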


Conclusions

The intent of this report was to examine possible mechanisms for
moving data between dissimilar processors and to identify the mechanism
most responsive to the requirements of a heterogeneous computer network.
Three data format translation schemes were reviewed, and from these, the
use of an intermediate data format was selected. Alternatives for the
intermediate formats were also discussed and one was proposed for general
use.


Some  problems  are  inherent  to  data  translation  and  are  independent  of 
the  translation  scheme.  Several  of  these  were  discussed  including  the 
passing  of  data  description,  data  type  and  format  incompatibilities  and 
loss  of  data  precision. 

Although implementation considerations were presented, the results of
a sample implementation were not. The unavailability of a suitable network
testbed made such an implementation infeasible.

The effort involved in preparing this report will be justified by an
implementation of the mechanism described. We hope this document will
serve as the foundation for such implementations on both local and
geographically distributed networks. Several topics that require further
investigation are discussed in the following section.


6.1 Areas for future study

Many of the problems surrounding the use of an intermediate data
format based network communication scheme have not been solved. The last
three chapters have pointed out areas that require further investigation.
These include mechanisms for data description between hosts and network
translators, the data format representation best suited for use in an IDF,
and strategies for data format error recovery. The next subsections
suggest other areas that still must be studied.


6.1.1 The contextual meaning of data

In certain processing environments, the appearance of particular
values can sometimes cause a special effect. That effect, while [triggered]
by the data item, is a predetermined reaction of the environment to the
value. In the case of such items, passing the type and
representation with the data bits to a second environment is not
sufficient. The meaning of the item in the [originating]
environment must also be sent.

Chapter III discussed the problem of "passing the semantic description
of a string of data bits." Mechanisms for describing data for this purpose
were presented in Section 3.1. The contextual meaning of data is
independent from issues of data type and format. It is separate, too, from
the formatting problems presented in Chapter IV. The ability to move data
values across the barrier of heterogeneity only uncovers the problem of
passing the effect those values have on a processing environment.

The difficulty encountered when moving lines of text from one computer
system to another is an example of a problem conveying contextual meaning.
Two systems invariably disagree on the interpretation of format control
characters. A specific instance is the use of the horizontal tab
character. One system may take its appearance to mean pad the current line
with spaces until the line character count is the next multiple of eight.
Another system may space to the next multiple of five, while still a third
may attribute no special meaning to it at all. Characters causing similar
problems include form feed, vertical tab and carriage return.
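
The tab example can be made concrete with a short sketch. It uses Python's
built-in `str.expandtabs` to model two hypothetical systems with tab stops
at multiples of eight and five; the variable names are illustrative only.

```python
source = "AB\tC"

# The same character produces a different layout on each system.
on_system_a = source.expandtabs(8)   # pads to column 8
on_system_b = source.expandtabs(5)   # pads to column 5

# Passing the effect rather than the character: the sender applies its
# own convention before transmission, so no tab is left for the receiver
# to reinterpret.
sent_effect = source.expandtabs(8)
received = sent_effect.expandtabs(5)  # nothing left to expand
```

Expanding at the sender preserves the layout but discards the tab itself,
so it conveys the effect of the character and not the character.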

The implementation of TELNET recognizes the problems of passing
contextual meaning. The effect a special character has in the receiving
environment can be established by prearrangement, and a TELNET negotiable
option for the disposition of many such text formatting characters is
described in <ARPA>.

It is important to understand that passing the tab character in itself
is a different problem. The concern here is accurately passing the effect
the tab had in the system in which it originated.


Special  text  characters,  however,  are  only  a small  part  of  the  set  of 
data  values  that  carry  contextual  meaning.  Crucial  to  process  coordination 
in  distributed  systems  will  be  the  meaningful  transmission  of  process 
control  and  synchronization  primitives. 


Standardization  of  data  formats  and  control  semantics  will  be 
essential  for  successful  communication.  While  we  do  have 
standards  at  the  very  lowest  levels  of  data  communication, 
such  as  conventions  for  transparent  binary  and  character 
codes,  we  do  not  see  similar  standards  even  for  such 
primitives  as  floating  point  numbers,  much  less  for  records, 
files,  or  objects.  The  situation  for  control  primitives  is 
much worse. Description of processes, interrupts, and
related mechanisms is presently very difficult to communicate
across computer boundaries except by specialized, ad hoc
methods. [<Levin> p. 16.]


Until  a mechanism  is  provided  to  pass  the  contextual  meaning  of 
special  data  values,  communication  between  cooperating  processes  will 
continue  to  be  supported  only  on  a case  by  case  basis.  The  requirements 
for  such  a mechanism  must  be  formally  described,  and  the  possibility  of 
using an extension of an intermediate format based strategy investigated.


6.1.2 Passing pointers

One  of  the  data  types  eligible  for  exchange  between  processes  is 
address  pointers.  The  use  of  pointers  in  certain  data  structures,  such  as 
linked lists, is indispensable, and a network data communications scheme
must  support  their  movement. 

If  the  communicating  processes  share  a single  address  space,  then 
passing  pointers  engenders  no  special  problems.  This  approach  is  only 
appropriate  for  homogeneous  computer  networks,  and  has  been  implemented  at 
CMU  for  local  computer  networks  <Swan,Wulf>.  Passing  pointers  does  present 
a problem,  however,  in  a heterogeneous  environment.  When  two  communicating 
processes  exchange  information,  a representation  of  the  data  being 
transferred  is  moved  from  the  address  space  of  the  sending  process  into  the 
address space of the receiver. The position in the address space of data
items to be referenced with pointers is important. Between the relocation
of  the  data  in  the  virtual  memory  of  the  receiving  process  and  the 
potential  differences  in  the  addressing  schemes  involved,  any  pointers  that 
move  between  processors  will  have  to  have  their  values  adjusted.  However, 
the  necessary  adjustment  cannot  merely  be  the  calculation  of  a fixed  offset 
within  the  virtual  memory.  Because  different  data  types  and  formats  are 
different  lengths  in  different  processing  environments,  the  adjustment  of 
the pointer value will depend on the kinds of items it references.

There appear to be two approaches to this problem. The sending
process can pass entire data structures so that any pointer used would be
"local" to the structure in a single message. By building pointers as
offsets from the beginning of the structure, the receiving translator could
map pointers into the values appropriate for the reformatted data.
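
The offset-based approach can be sketched as follows. The item sizes for
the two hosts are invented for illustration, as are the function names. The
sketch shows why the adjustment is not a fixed displacement: each pointer
is mapped through the item it references.

```python
# Hypothetical item sizes, in bytes, on the sending and receiving hosts.
SENDER_SIZES = {"int": 2, "real": 4, "ptr": 2}
RECEIVER_SIZES = {"int": 4, "real": 8, "ptr": 4}


def item_offsets(types, sizes):
    """Offset of each item from the beginning of the structure."""
    offsets, pos = [], 0
    for t in types:
        offsets.append(pos)
        pos += sizes[t]
    return offsets


def remap_pointer(sender_offset, types):
    """Map a structure-relative pointer from the sender's layout to the
    receiver's layout by way of the item it references."""
    send = item_offsets(types, SENDER_SIZES)
    recv = item_offsets(types, RECEIVER_SIZES)
    index = send.index(sender_offset)  # ValueError if not an item boundary
    return recv[index]


layout = ["int", "real", "ptr", "int"]
sender_view = item_offsets(layout, SENDER_SIZES)
receiver_view = item_offsets(layout, RECEIVER_SIZES)
first = remap_pointer(2, layout)   # pointer to the second item
last = remap_pointer(8, layout)    # pointer to the fourth item
```

In this example the second item moves by two bytes and the fourth by
eight, so no single offset correction would serve both pointers.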

A different approach is necessary for passing pointers that reference
data structures too large to be feasible or practical to move. In this
case, messages may contain pointers into the sender's address space for use
by the receiver.

The principal difficulty in supporting this kind of passed pointer is
the possible relocation of the sending process. The movement of absolute
memory addresses can be avoided by passing offsets into the address space
of the sending process. This will allow the relocation of the
sending process within a single processor. However, if that process
migrates to another processor, the appearance of its address space (i.e.
the length and format of the data items contained in it) will change.
Maintaining passed pointers as simple offsets in this case will not be
sufficient.


6.1.3  Passing  programs 

One of the proposed uses for network based systems is reliability
through redundancy and increased performance through load sharing.
Realization of either of these goals in a general way requires a mechanism
to support the movement of program code through the network.

One strategy is to pass the source text of a high level language for
which each host has a compiler. Code transportability has been attempted
in this way by the specification of standard versions of FORTRAN and COBOL.
Another possibility is the use of an intermediate programming language to
bridge the gap between compilers and the object code of different
processors. The application of these techniques to program passing needs
to be considered.


6.1.4  Negotiating  the  IDF 

The  intermediate  format  to  be  used  to  represent  data  items  as  they 
move  between  IDF  translators  must  be  fully  general,  allowing  any  data 
values  represented  in  one  host  to  be  transferred  to  another.  In  some 
cases,  however,  the  use  of  general  formats  may  be  unnecessary  and 
inefficient,  and  an  ability  to  select  format  options  for  intermediate  data 
formats may be useful.


Probably  the  most  frequent  use  of  such  a mechanism  would  be  to 
facilitate  data  transfer  between  similar  processors  in  a network.  The  IDF 
is  intended  to  serve  as  a common  language  for  data  transfer.  When  machines 
that  use  the  same  data  formats  for  identical  sets  of  data  types  wish  to 
exchange information, no intermediate format translation need, nor should,
occur.

One way to provide the additional flexibility offered by intermediate
format options is through the use of option negotiation. By using a
predetermined protocol, two communicating translators may agree to use a
non-standard intermediate format or possibly to perform no data
reformatting at all. The potential uses and implementation strategies for
such a facility are yet one more area that requires further study.
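
One possible shape for such a negotiation is sketched below. The option
names and the rule used (the first option offered by the initiator that the
responder also accepts, with the standard IDF as the fallback) are
assumptions for illustration; the report leaves the protocol unspecified.

```python
def negotiate(offered, accepted, default="STANDARD-IDF"):
    """Return the first option, in the initiator's order of preference,
    that the responder also accepts. Fall back to the standard
    intermediate format when the two sides share no option."""
    for option in offered:
        if option in accepted:
            return option
    return default


# Two similar processors: both are willing to skip reformatting.
similar = negotiate(["NO-TRANSLATION", "COMPACT-IDF"],
                    {"NO-TRANSLATION", "COMPACT-IDF"})

# Dissimilar processors that share a non-standard intermediate format.
partial = negotiate(["NO-TRANSLATION", "COMPACT-IDF"], {"COMPACT-IDF"})

# No common option: the fully general format is always available.
fallback = negotiate(["NO-TRANSLATION"], set())
```

Keeping the standard IDF as the default preserves the guarantee that any
two translators can communicate even when negotiation yields nothing.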